Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Amount of data Expand / Collapse
Author
Message
Posted Sunday, November 25, 2012 12:00 PM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Sunday, February 9, 2014 8:27 PM
Points: 198, Visits: 223
I was wondering if there is a way to "trap" or track the amount of data (in bytes) that a select statement pulls when querying a table or tables. I'm trying to show some co-workers that using "select *" instead of "select <column_name>" is better. The only way I feel I can accomplish this is to show them the difference in the amount of data that they are pulling.

Thanks in advance



Post #1388414
Posted Sunday, November 25, 2012 12:12 PM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 4:00 PM
Points: 42,301, Visits: 35,356
You can look at the execution plan, you're looking for the last arrow, the one pointing at the select operator. It is an estimate, not an actual, assuming stats are accurate it should be a decent estimate if the query is simple.


Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1388415
Posted Monday, November 26, 2012 6:04 AM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Today @ 12:05 PM
Points: 15,417, Visits: 27,820
You could take a look at the sys.dm_exec_query_stats information. You can see the total_logical_reads for all executions of the query. Compare the output of the two, possibly using the number of executions, also there, to get an average.

----------------------------------------------------
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." Theodore Roosevelt
The Scary DBA
Author of: SQL Server 2012 Query Performance Tuning
SQL Server 2008 Query Performance Tuning Distilled
and
SQL Server Execution Plans

Product Evangelist for Red Gate Software
Post #1388570
Posted Monday, November 26, 2012 3:13 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Yesterday @ 5:33 PM
Points: 1,207, Visits: 2,506
sqldummy79 (11/25/2012)
I was wondering if there is a way to "trap" or track the amount of data (in bytes) that a select statement pulls when querying a table or tables. I'm trying to show some co-workers that using "select *" instead of "select <column_name>" is better. The only way I feel I can accomplish this is to show them the difference in the amount of data that they are pulling.

Thanks in advance


Wait - whaaa? How is using "select *" better than using "select <column_name>"?

Selecting every column (i.e., "SELECT *") is going to result in, at a minimum, the retrieval of rows from the leaf level of the clustered index or from the heap (or from the leaf level of a non-clustered index that includes all columns of the table). If the optimizer uses a non-clustered index that is not a covering index in evaluating the WHERE clause or join conditions, there will be a index scan or seek followed by a key lookup, which means that SQL Server must read even more pages to get to the final result set.

On the other hand, "SELECT <column_name, n>" may require SQL Server to read less data than "SELECT *" when a covering index exists that includes fewer than all the columns of the table and the optimizer can use it to satisfy the query. SQL Server will read the data from the leaf level of the covering index, where the rows will be narrower than the clustered index/heap rows. More rows may fit on each page, so SQL Server may be able to satisfy the query by reading fewer pages.

While there may be some use cases that demonstrate the opposite, I feel fairly confident that "SELECT <column_name, . . . n>" queries will generally require fewer page reads than "SELECT *" queries when well-constructed covering indexes are available. In the absence of covering indexes, SELECT <column_name, . . . n>" queries generally should not require SQL Server to read more data than "SELECT *" queries.

If the OP's question pertains to the amount of data returned by SQL Server to the requesting application or client (i.e., the number of bytes SQL Server sends through the network pipeline), then surely "SELECT <column_name, . . . n" will return fewer bytes than "SELECT *" where n < the total number of columns in the table.

I hope that the OP simply mistyped his question and meant to say that "SELECT <column_name>" will be better than "SELECT *". There are many more, and more important, reasons that this is true. If I'm missing something here, please show me where I've gone astray.


Jason Wolfkill
Blog: SQLSouth
Twitter: @SQLSouth
Post #1388870
Posted Monday, November 26, 2012 3:36 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 5:03 PM
Points: 6,085, Visits: 7,104
wolfkillj (11/26/2012)

I hope that the OP simply mistyped his question and meant to say that "SELECT <column_name>" will be better than "SELECT *". There are many more, and more important, reasons that this is true. If I'm missing something here, please show me where I've gone astray.


You haven't missed anything. SELECT * is never better, both mechanically and for long-term support for eventual schema changes.



- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. |Tally Tables

Twitter: @AnyWayDBA
Post #1388875
Posted Monday, November 26, 2012 10:05 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 9:15 PM
Points: 36,598, Visits: 31,033
sqldummy79 (11/25/2012)
I was wondering if there is a way to "trap" or track the amount of data (in bytes) that a select statement pulls when querying a table or tables. I'm trying to show some co-workers that using "select *" instead of "select <column_name>" is better. The only way I feel I can accomplish this is to show them the difference in the amount of data that they are pulling.

Thanks in advance


SET STATISTICS IO ON is likely what you're looking for.

I believe that your logic is flawed though. Your statement is true only if the idexes on the table are not used even on a small table such as the following...

USE AdventureWorks;

SET STATISTICS IO ON
--===== All columns from an indexed table (9 reads)
SELECT * FROM HumanResources.Employee;

--===== All columns from an identical but non-indexed table (8 reads)
SELECT * FROM #NoIndexes;

--===== 2 columns from an indexed table (5 reads)
SELECT EmployeeID, LoginID
FROM HumanResources.Employee;

--===== 2 columns from an identical but non-indexed table (8 reads)
SELECT EmployeeID, LoginID
FROM #NoIndexes;
---------------------------------------------------------------------------
--===== All columns from an indexed table (2 reads)
SELECT * FROM HumanResources.Employee WHERE EmployeeID = 137;

--===== All columns from an identical but non-indexed table (8 reads)
SELECT * FROM #NoIndexes WHERE EmployeeID = 137;

--===== 2 columns from an indexed table (2 reads)
SELECT EmployeeID, LoginID
FROM HumanResources.Employee
WHERE EmployeeID = 137;

--===== 2 columns from an identical but non-indexed table (8 reads)
SELECT EmployeeID, LoginID
FROM #NoIndexes
WHERE EmployeeID = 137;

If you compare what looks like a savings in the number of reads (for the single row criteria examples) with the last arrow to the left of the "Actual Execution Plan", as Gail suggested, you will see that even the single row criteria returned less (estimated, in this case) data and less data is better for the "Pipe".

Unless you're thinking of something else, the code above (along with the "Actual Execution Plan") proves that including only the columns you need is better for the "Pipe" whether you have an index or not and whether you have criteria or not.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1388935
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse