• sqldummy79 (11/25/2012)


    I was wondering if there is a way to "trap" or track the amount of data (in bytes) that a select statement pulls when querying a table or tables. I'm trying to show some co-workers that using "select *" instead of "select <column_name>" is better. The only way I feel I can accomplish this is to show them the difference in the amount of data that they are pulling.

    Thanks in advance

    Wait - whaaa? How is using "select *" better than using "select <column_name>"?

    Selecting every column (i.e., "SELECT *") is going to result in, at a minimum, the retrieval of rows from the leaf level of the clustered index or from the heap (or from the leaf level of a non-clustered index that happens to include all columns of the table). If the optimizer uses a non-clustered index that is not a covering index to evaluate the WHERE clause or join conditions, there will be an index seek or scan followed by a key lookup, which means that SQL Server must read even more pages to get to the final result set.
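
    A minimal sketch of that key lookup pattern (the table, index, and values here are hypothetical, invented just for illustration):

    -- Hypothetical table for illustration: clustered index on OrderID
    CREATE TABLE dbo.Orders
    (
        OrderID    INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
        CustomerID INT NOT NULL,
        OrderDate  DATETIME NOT NULL,
        Notes      VARCHAR(2000) NULL
    );

    -- Non-clustered index that is NOT covering for SELECT *
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
        ON dbo.Orders (CustomerID);

    -- The optimizer can seek IX_Orders_CustomerID for the WHERE clause,
    -- but SELECT * needs every column, so each matching row also costs
    -- a key lookup into the clustered index.
    SELECT * FROM dbo.Orders WHERE CustomerID = 42;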

    On the other hand, "SELECT <column_name, . . . n>" may require SQL Server to read less data than "SELECT *" when a covering index exists that includes fewer than all the columns of the table and the optimizer can use it to satisfy the query. SQL Server will read the data from the leaf level of the covering index, where the rows are narrower than the clustered index/heap rows. More rows fit on each page, so SQL Server may be able to satisfy the query by reading fewer pages.
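
    One way to show co-workers the difference in pages read is SET STATISTICS IO ON, which reports logical reads per table. A sketch, still using the made-up dbo.Orders table from above:

    -- Covering index for the narrow query below
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_OrderDate
        ON dbo.Orders (CustomerID) INCLUDE (OrderDate);

    SET STATISTICS IO ON;   -- reports logical reads (8 KB pages) per table

    -- Covered: satisfied entirely from the narrower index leaf pages
    SELECT CustomerID, OrderDate
    FROM dbo.Orders
    WHERE CustomerID = 42;

    -- Not covered: must also touch the wider clustered index rows
    SELECT *
    FROM dbo.Orders
    WHERE CustomerID = 42;

    SET STATISTICS IO OFF;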

    While there may be some use cases that demonstrate the opposite, I feel fairly confident that "SELECT <column_name, . . . n>" queries will generally require fewer page reads than "SELECT *" queries when well-constructed covering indexes are available. In the absence of covering indexes, "SELECT <column_name, . . . n>" queries generally should not require SQL Server to read more data than "SELECT *" queries.

    If the OP's question pertains to the amount of data returned by SQL Server to the requesting application or client (i.e., the number of bytes SQL Server sends through the network pipeline), then surely "SELECT <column_name, . . . n>" will return fewer bytes than "SELECT *" where n is less than the total number of columns in the table.
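
    If it's the payload size the OP wants to demonstrate, one rough approach (again using the hypothetical dbo.Orders table from above) is to sum DATALENGTH over each column list. This ignores protocol overhead, but the relative difference makes the point; SSMS's Query > Include Client Statistics option also reports bytes received from the server per query.

    -- Approximate bytes in the result set for each column list
    SELECT
        SUM(DATALENGTH(CustomerID) + DATALENGTH(OrderDate)) AS narrow_bytes,
        SUM(DATALENGTH(OrderID) + DATALENGTH(CustomerID)
          + DATALENGTH(OrderDate)
          + ISNULL(DATALENGTH(Notes), 0))                   AS star_bytes
          -- ISNULL: NULL Notes contribute 0 bytes instead of
          -- nullifying the whole sum
    FROM dbo.Orders
    WHERE CustomerID = 42;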

    I hope that the OP simply mistyped his question and meant to say that "SELECT <column_name>" will be better than "SELECT *". There are many more, and more important, reasons that this is true. If I'm missing something here, please show me where I've gone astray. 🙂

    Jason Wolfkill