|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Thursday, March 07, 2013 7:55 AM
Points: 198,
Visits: 205
|
|
I was wondering if there is a way to "trap" or track the amount of data (in bytes) that a select statement pulls when querying a table or tables. I'm trying to show some co-workers that using "select *" instead of "select <column_name>" is better. The only way I feel I can accomplish this is to show them the difference in the amount of data that they are pulling.
Thanks in advance
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 12:59 PM
Points: 37,640,
Visits: 29,895
|
|
You can look at the execution plan, you're looking for the last arrow, the one pointing at the select operator. It is an estimate, not an actual, assuming stats are accurate it should be a decent estimate if the query is simple.
Gail Shaw Microsoft Certified Master: SQL Server 2008, MVP SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
We walk in the dark places no others will enter We stand on the bridge and no one may pass
|
|
|
|
|
SSChampion
        
Group: General Forum Members
Last Login: Today @ 4:50 AM
Points: 13,371,
Visits: 25,143
|
|
|
|
|
|
Right there with Babe
      
Group: General Forum Members
Last Login: Today @ 3:57 PM
Points: 721,
Visits: 1,366
|
|
sqldummy79 (11/25/2012) I was wondering if there is a way to "trap" or track the amount of data (in bytes) that a select statement pulls when querying a table or tables. I'm trying to show some co-workers that using "select *" instead of "select <column_name>" is better. The only way I feel I can accomplish this is to show them the difference in the amount of data that they are pulling.
Thanks in advance
Wait - whaaa? How is using "select *" better than using "select <column_name>"?
Selecting every column (i.e., "SELECT *") is going to result in, at a minimum, the retrieval of rows from the leaf level of the clustered index or from the heap (or from the leaf level of a non-clustered index that includes all columns of the table). If the optimizer uses a non-clustered index that is not a covering index in evaluating the WHERE clause or join conditions, there will be a index scan or seek followed by a key lookup, which means that SQL Server must read even more pages to get to the final result set.
On the other hand, "SELECT <column_name, n>" may require SQL Server to read less data than "SELECT *" when a covering index exists that includes fewer than all the columns of the table and the optimizer can use it to satisfy the query. SQL Server will read the data from the leaf level of the covering index, where the rows will be narrower than the clustered index/heap rows. More rows may fit on each page, so SQL Server may be able to satisfy the query by reading fewer pages.
While there may be some use cases that demonstrate the opposite, I feel fairly confident that "SELECT <column_name, . . . n>" queries will generally require fewer page reads than "SELECT *" queries when well-constructed covering indexes are available. In the absence of covering indexes, SELECT <column_name, . . . n>" queries generally should not require SQL Server to read more data than "SELECT *" queries.
If the OP's question pertains to the amount of data returned by SQL Server to the requesting application or client (i.e., the number of bytes SQL Server sends through the network pipeline), then surely "SELECT <column_name, . . . n" will return fewer bytes than "SELECT *" where n < the total number of columns in the table.
I hope that the OP simply mistyped his question and meant to say that "SELECT <column_name>" will be better than "SELECT *". There are many more, and more important, reasons that this is true. If I'm missing something here, please show me where I've gone astray.
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Today @ 10:09 PM
Points: 5,658,
Visits: 6,100
|
|
wolfkillj (11/26/2012)
I hope that the OP simply mistyped his question and meant to say that "SELECT <column_name>" will be better than "SELECT *". There are many more, and more important, reasons that this is true. If I'm missing something here, please show me where I've gone astray.
You haven't missed anything. SELECT * is never better, both mechanically and for long-term support for eventual schema changes.
- Craig Farrell
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette For index/tuning help, follow these directions. |Tally Tables Twitter: @AnyWayDBA
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 8:21 PM
Points: 32,893,
Visits: 26,765
|
|
sqldummy79 (11/25/2012) I was wondering if there is a way to "trap" or track the amount of data (in bytes) that a select statement pulls when querying a table or tables. I'm trying to show some co-workers that using "select *" instead of "select <column_name>" is better. The only way I feel I can accomplish this is to show them the difference in the amount of data that they are pulling.
Thanks in advance
SET STATISTICS IO ON is likely what you're looking for.
I believe that your logic is flawed though. Your statement is true only if the idexes on the table are not used even on a small table such as the following...
USE AdventureWorks;
SET STATISTICS IO ON --===== All columns from an indexed table (9 reads) SELECT * FROM HumanResources.Employee;
--===== All columns from an identical but non-indexed table (8 reads) SELECT * FROM #NoIndexes;
--===== 2 columns from an indexed table (5 reads) SELECT EmployeeID, LoginID FROM HumanResources.Employee;
--===== 2 columns from an identical but non-indexed table (8 reads) SELECT EmployeeID, LoginID FROM #NoIndexes; --------------------------------------------------------------------------- --===== All columns from an indexed table (2 reads) SELECT * FROM HumanResources.Employee WHERE EmployeeID = 137;
--===== All columns from an identical but non-indexed table (8 reads) SELECT * FROM #NoIndexes WHERE EmployeeID = 137;
--===== 2 columns from an indexed table (2 reads) SELECT EmployeeID, LoginID FROM HumanResources.Employee WHERE EmployeeID = 137;
--===== 2 columns from an identical but non-indexed table (8 reads) SELECT EmployeeID, LoginID FROM #NoIndexes WHERE EmployeeID = 137;
If you compare what looks like a savings in the number of reads (for the single row criteria examples) with the last arrow to the left of the "Actual Execution Plan", as Gail suggested, you will see that even the single row criteria returned less (estimated, in this case) data and less data is better for the "Pipe".
Unless you're thinking of something else, the code above (along with the "Actual Execution Plan") proves that including only the columns you need is better for the "Pipe" whether you have an index or not and whether you have criteria or not.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|