• Hunterwood (2/16/2010)


    It´s allways a good rule of thumb to use TOP 1 together with EXISTS, because it prevents the database engine from doing unneccesary work.

    Got an example that proves that?

    I you write:

    select *

    from OneTable

    where EXISTS (select * from SecondTable where OneCol = AnotherCol)

    The database engine will first get ALL rows in the SecondTable, then it will check if the result has any rows.

    No it will not.

    Using an example in my testing DB... (250000 rows in BigTable, 3819 in SmallerTable)

    SELECT * FROM SmallerTable --(ALL the rows in SmallerTable)

    Table 'SmallerTable'. Scan count 1, logical reads 14, physical reads 0

    Hence I can conclude that a read of ALL the rows in SmallerTable does 14 logical reads

    SELECT *

    FROM dbo.BigTable

    WHERE EXISTS (SELECT * FROM dbo.SmallerTable WHERE dbo.BigTable.SomeColumn = dbo.SmallerTable.LookupColumn)

    Table 'BigTable'. Scan count 1, logical reads 3639, physical reads 0

    Table 'SmallerTable'. Scan count 1, logical reads 9, physical reads 0

    Only 9 logical reads this time, so it has not read the entire of SmallerTable

    If you instead specify TOP 1:

    select *

    from OneTable

    where EXISTS (select TOP 1 * from SecondTable where OneCol = AnotherCol)

    The database engine will only fetch ONE row from the SecondTable, then it will check if the result has any rows.

    SELECT *

    FROM dbo.BigTable

    WHERE EXISTS (SELECT TOP(1) * FROM dbo.SmallerTable WHERE dbo.BigTable.SomeColumn = dbo.SmallerTable.LookupColumn)

    Table 'BigTable'. Scan count 1, logical reads 3639, physical reads 0.

    Table 'SmallerTable'. Scan count 1, logical reads 9, physical reads 0

    Identical number of logical reads to the case where I didn't have the TOP 1.

    The same thing is applicable for TSQL statements like:

    IF ((select count(*) from myTable) > 0)

    ...

    If you change this to:

    IF (EXISTS (select top 1 * from myTable))

    The database engine doesn´t have to count the rows, when you aren´t even interrested in how many there are...

    The optimiser is smart enough to realise that when you say IF ((select count(*) from myTable) > 0) it doesn't have to count the rows anyway.

    SELECT COUNT(*) FROM dbo.BigTable

    Table 'BigTable'. Scan count 1, logical reads 438, physical reads 0

    Hence we can conclude that counting all the rows in BigTable requires 438 logical reads

    IF (SELECT COUNT(*) FROM dbo.BigTable) > 0

    PRINT 'Rows!'

    Table 'BigTable'. Scan count 1, logical reads 3, physical reads 0

    With only 3 logical reads, that did not count all the rows in the table

    IF EXISTS (SELECT * FROM dbo.BigTable)

    PRINT 'Rows!'

    Table 'BigTable'. Scan count 1, logical reads 3, physical reads 0

    Exactly the same number of reads as for the IF... COUNT

    Now, if you count the rows, assign that to a variable and then check if the variable is > 0, there is a performance difference as, in that case, SQL does have to count the rows as it can't tell what else you may do with the variable later.

    DECLARE @i INT

    SELECT @i = COUNT(*) FROM dbo.BigTable

    IF @i > 0

    PRINT 'Rows!'

    This construct does count all the rows in the table, and the IO stats proves that

    Table 'BigTable'. Scan count 1, logical reads 438, physical reads 0

    If the databases are small the performance difference might not be so huge, but it´s allways a good habbit. 🙂

    A good habit is to test alternatives and see exactly what the performance difference really is, if there's one in the first place.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass