Putts (2/20/2009)
Perhaps I have always just run into the perfect scenarios for using distinct and now have crawled into others where it is not as useful.Are there certain cases where distinct have always shown to increase performance?
Asking around fellow developers where I work - none of them have ever seen Distincts take longer than natural SELECTs so maybe this is something that only affects more advanced T-SQL queries where you folks see it on a daily basis and the regular "joe schmoe" SQL query writer does not.
I think Gail might have it, it's moving fewer rows, so that's saving processing time. Let's take an example where identical number of rows are returned. You can run both these on AdventureWorks
SELECT sod.SalesOrderID,sod.ProductID FROM
Sales.SalesOrderDetail AS sod
WHERE sod.SalesOrderId = 43666
SELECT DISTINCT sod.SalesOrderID,sod.ProductID FROM
Sales.SalesOrderDetail AS sod
WHERE sod.SalesOrderID = 43666
And if you look at the STATISTICS IO & TIME:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
(6 row(s) affected)
Table 'SalesOrderDetail'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(6 row(s) affected)
Table 'SalesOrderDetail'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 15 ms.
SQL Server parse and compile time:
And I uploaded a screen shot of the execution plans (sorry, being a bit lazy). You can see that while the exact same data was returned, more work was done by the second query. The estimated costs were 18% & 82% of the over all.
DISTINCT causes performance issues. It just does.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning