How is SQL query processed in this example?

  • Hi,

    I have already posted the problem on Stackoverflow. Can you please help?

    http://stackoverflow.com/questions/15505165/how-is-sql-query-processed-in-this-example

    Thanks much in advance!

    Regards,

    Nayan

  • Not sure what your problem is there. Looks to me like what you need to do is run some tests to determine 1) You get the results you expect, 2) which one is more effecient if both return the same results.

  • My question is the latter one - which is more efficient.

  • cnayan (3/19/2013)


    My question is the latter one - which is more efficient.

    So do as Lynn suggested IF they both return correct results.

    Looks to me like what you need to do is run some tests to determine 1) You get the results you expect, 2) which one is more effecient if both return the same results.

    I suspect they will both produce the same execution plan. That means that neither one of them is going to be any different. However the first one is FAR easier to read and understand.

    By far the best way to get answers about which of two approaches performs better is to test them on your system!!!

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • cnayan (3/19/2013)


    My question is the latter one - which is more efficient.

    I tested this a little and here's what I determined: Both queries you posted will produce the exact same query plan and therefore are equally efficient. Taking the sample data provided (and not knowing if either id represented a clustered index), the following queries will most likely produce the exact same query plan (regardless of there are any clustered indexes present):

    IF OBJECT_ID('tempdb..#names') IS NOT NULL DROP TABLE #names;

    IF OBJECT_ID('tempdb..#TableB') IS NOT NULL DROP TABLE #TableB;

    CREATE TABLE #names (id int, name char(4));

    CREATE TABLE #TableB (id int, ChildCount int);

    INSERT INTO #names SELECT 1,'A100' UNION ALL SELECT 2,'A100' UNION ALL SELECT 3,'A100' UNION ALL SELECT 4,'A200';

    INSERT INTO #TableB SELECT 1,10 UNION ALL SELECT 2,25 UNION ALL SELECT 3,10 UNION ALL SELECT 4,70;

    select n.[ID]

    from #Names n

    inner join #TableB b

    on b.[ID] = n.[ID]

    and b.[ChildCount] > 50

    where n.[Name] Like 'A%'

    select n.[ID]

    from #Names n

    inner join #TableB b

    on b.[ID] = n.[ID]

    and b.[ChildCount] > 50

    and n.[Name] Like 'A%'

    select n.[ID]

    from #Names n

    inner join #TableB b

    on b.[ID] = n.[ID]

    and LEFT(n.[Name],1)='A'

    and b.[ChildCount] > 50

    select n.[ID]

    from #Names n

    inner join #TableB b

    on b.[ID] = n.[ID]

    and n.[Name] Like 'A%'

    and b.[ChildCount] > 50

    select n.[ID]

    from #Names n

    inner JOIN #TableB b ON n.id=b.id

    WHERE LEFT(name,1)='A'

    AND ChildCount>50

    ;WITH filtered_names AS

    (select [ID]

    from #Names WHERE [Name] Like 'A%')

    SELECT n.id

    FROM #TableB b

    inner join filtered_names n ON n.id=b.id

    WHERE b.ChildCount>50

    select n.[ID]

    from #Names n

    CROSS JOIN #TableB b

    WHERE b.[ID] = n.[ID]

    and b.[ChildCount] > 50

    and n.[Name] Like 'A%'

    select n.[ID]

    from #Names n

    CROSS APPLY #TableB b

    WHERE b.[ID] = n.[ID]

    and b.[ChildCount] > 50

    and n.[Name] Like 'A%'

    If you don't have a clustered index on either table then each of the above queries will be equally slow and the queries below will produce the best plan. This is because the OUTER joins will produce plans with nested loops whereas the queries above will still generate a hash match:

    select n.[ID]

    from #Names n

    FULL JOIN #TableB b ON n.id=b.id

    WHERE LEFT(name,1)='A'

    AND ChildCount>50

    select n.[ID]

    from #Names n

    RIGHT JOIN #TableB b ON n.id=b.id

    WHERE LEFT(name,1)='A'

    AND ChildCount>50

    select n.[ID]

    from #TableB b

    LEFT JOIN #Names n ON n.id=b.id

    WHERE LEFT(name,1)='A'

    AND ChildCount>50

    Edit: typo & bonus code

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Thanks Alan.B for testing it extensively! 🙂

    When I ran your queries, like:

    select n.*

    from Names n

    FULL JOIN TableB b ON n._id=b._id

    WHERE LEFT(name,1)='A'

    AND ChildCount>50

    it showed Nested Loop as you suggested.

    But, I ran my 2 queries, and the Actual EPs also showed Nested Loop for Inner Join.

    Any idea why you suspected otherwise?

    This is because the OUTER joins will produce plans with nested loops whereas the queries above will still generate a hash match

    Alan.B (3/19/2013)


    cnayan (3/19/2013)


    My question is the latter one - which is more efficient.

    I tested this a little and here's what I determined: Both queries you posted will produce the exact same query plan and therefore are equally efficient. Taking the sample data provided (and not knowing if either id represented a clustered index), the following queries will most likely produce the exact same query plan (regardless of there are any clustered indexes present):

    IF OBJECT_ID('tempdb..#names') IS NOT NULL DROP TABLE #names;

    IF OBJECT_ID('tempdb..#TableB') IS NOT NULL DROP TABLE #TableB;

    CREATE TABLE #names (id int, name char(4));

    CREATE TABLE #TableB (id int, ChildCount int);

    INSERT INTO #names SELECT 1,'A100' UNION ALL SELECT 2,'A100' UNION ALL SELECT 3,'A100' UNION ALL SELECT 4,'A200';

    INSERT INTO #TableB SELECT 1,10 UNION ALL SELECT 2,25 UNION ALL SELECT 3,10 UNION ALL SELECT 4,70;

    select n.[ID]

    from #Names n

    inner join #TableB b

    on b.[ID] = n.[ID]

    and b.[ChildCount] > 50

    where n.[Name] Like 'A%'

    select n.[ID]

    from #Names n

    inner join #TableB b

    on b.[ID] = n.[ID]

    and b.[ChildCount] > 50

    and n.[Name] Like 'A%'

    select n.[ID]

    from #Names n

    inner join #TableB b

    on b.[ID] = n.[ID]

    and LEFT(n.[Name],1)='A'

    and b.[ChildCount] > 50

    select n.[ID]

    from #Names n

    inner join #TableB b

    on b.[ID] = n.[ID]

    and n.[Name] Like 'A%'

    and b.[ChildCount] > 50

    select n.[ID]

    from #Names n

    inner JOIN #TableB b ON n.id=b.id

    WHERE LEFT(name,1)='A'

    AND ChildCount>50

    ;WITH filtered_names AS

    (select [ID]

    from #Names WHERE [Name] Like 'A%')

    SELECT n.id

    FROM #TableB b

    inner join filtered_names n ON n.id=b.id

    WHERE b.ChildCount>50

    select n.[ID]

    from #Names n

    CROSS JOIN #TableB b

    WHERE b.[ID] = n.[ID]

    and b.[ChildCount] > 50

    and n.[Name] Like 'A%'

    select n.[ID]

    from #Names n

    CROSS APPLY #TableB b

    WHERE b.[ID] = n.[ID]

    and b.[ChildCount] > 50

    and n.[Name] Like 'A%'

    If you don't have a clustered index on either table then each of the above queries will be equally slow and the queries below will produce the best plan. This is because the OUTER joins will produce plans with nested loops whereas the queries above will still generate a hash match:

    select n.[ID]

    from #Names n

    FULL JOIN #TableB b ON n.id=b.id

    WHERE LEFT(name,1)='A'

    AND ChildCount>50

    select n.[ID]

    from #Names n

    RIGHT JOIN #TableB b ON n.id=b.id

    WHERE LEFT(name,1)='A'

    AND ChildCount>50

    select n.[ID]

    from #TableB b

    LEFT JOIN #Names n ON n.id=b.id

    WHERE LEFT(name,1)='A'

    AND ChildCount>50

    Edit: typo & bonus code

  • cnayan (3/19/2013)


    Thanks Alan.B for testing it extensively! 🙂

    When I ran your queries, like:

    select n.*

    from Names n

    FULL JOIN TableB b ON n._id=b._id

    WHERE LEFT(name,1)='A'

    AND ChildCount>50

    it showed Nested Loop as you suggested.

    But, I ran my 2 queries, and the Actual EPs also showed Nested Loop for Inner Join.

    Any idea why you suspected otherwise?

    If you have some sample data and DDL I can tell you what I think. I don't know if you have a clustered index on either table. If there was a clustered index or unique index you will likely see nested loops for the inner join queries.

    Let's take the following three examples. In the first I am performing three queries against two heaps (no clustered index is present)...

    Note my comments

    IF OBJECT_ID('tempdb..#names') IS NOT NULL DROP TABLE #names;

    IF OBJECT_ID('tempdb..#TableB') IS NOT NULL DROP TABLE #TableB;

    CREATE TABLE #names (id int, name char(4));

    CREATE TABLE #TableB (id int, ChildCount int);

    INSERT INTO #names SELECT 1,'A100' UNION ALL SELECT 2,'A100' UNION ALL SELECT 3,'A100' UNION ALL SELECT 4,'A200';

    INSERT INTO #TableB SELECT 1,10 UNION ALL SELECT 2,25 UNION ALL SELECT 3,10 UNION ALL SELECT 4,70;

    -- all the work is done durning the hash match

    select n.id

    from #Names n

    INNER JOIN #TableB b

    onb.id = n.id

    and b.ChildCount > 50

    where n.Name Like 'A%'

    -- all the work is done durning the hash match

    select n.id

    from #Names n

    INNER JOIN #TableB b

    onb.[ID] = n.[ID]

    and b.ChildCount > 50

    and LEFT(n.Name,1)='A'

    --all the work is done during the table scan

    select n.[ID]

    from #Names n

    RIGHT JOIN #TableB b ON n.id=b.id

    WHERE LEFT(n.name,1)='A'

    AND b.ChildCount>50

    Now lets add a clustered index...

    Again, note my comments...

    IF OBJECT_ID('tempdb..#names') IS NOT NULL DROP TABLE #names;

    IF OBJECT_ID('tempdb..#TableB') IS NOT NULL DROP TABLE #TableB;

    CREATE TABLE #names (id int, name char(4));

    CREATE TABLE #TableB (id int, ChildCount int);

    CREATE CLUSTERED INDEX idx_n_id ON #names(id);

    CREATE CLUSTERED INDEX idx_tb_id ON #TableB(id);

    INSERT INTO #names SELECT 1,'A100' UNION ALL SELECT 2,'A100' UNION ALL SELECT 3,'A100' UNION ALL SELECT 4,'A200';

    INSERT INTO #TableB SELECT 1,10 UNION ALL SELECT 2,25 UNION ALL SELECT 3,10 UNION ALL SELECT 4,70;

    --all the work is done during the clustered index scan

    select n.id

    from #Names n

    INNER JOIN #TableB b

    onb.id = n.id

    and b.ChildCount > 50

    where n.Name Like 'A%'

    --all the work is done during the clustered index scan

    select n.id

    from #Names n

    INNER JOIN #TableB b

    onb.[ID] = n.[ID]

    and b.ChildCount > 50

    and LEFT(n.Name,1)='A'

    --all the work is done during the clustered index scan

    select n.[ID]

    from #Names n

    RIGHT JOIN #TableB b ON n.id=b.id

    WHERE LEFT(n.name,1)='A'

    AND b.ChildCount>50

    GO

    Now, if there is a unique constraint but no clustered index then you will see nested loops for all three queries but the RIGHT join will still perform the best.

    IF OBJECT_ID('tempdb..#names') IS NOT NULL DROP TABLE #names;

    IF OBJECT_ID('tempdb..#TableB') IS NOT NULL DROP TABLE #TableB;

    CREATE TABLE #names (id int unique, name char(4));

    CREATE TABLE #TableB (id int unique, ChildCount int);

    INSERT INTO #names SELECT 1,'A100' UNION ALL SELECT 2,'A100' UNION ALL SELECT 3,'A100' UNION ALL SELECT 4,'A200';

    INSERT INTO #TableB SELECT 1,10 UNION ALL SELECT 2,25 UNION ALL SELECT 3,10 UNION ALL SELECT 4,70;

    -- most of the work is done during the index seek and table scans, the rest by an RID lookup

    select n.id

    from #Names n

    INNER JOIN #TableB b

    onb.id = n.id

    and b.ChildCount > 50

    where n.Name Like 'A%'

    -- most of the work is done during the index seek and table scans, the rest by an RID lookup

    select n.id

    from #Names n

    INNER JOIN #TableB b

    onb.[ID] = n.[ID]

    and b.ChildCount > 50

    and LEFT(n.Name,1)='A'

    --all the work is done during the table scan

    select n.[ID]

    from #Names n

    RIGHT JOIN #TableB b ON n.id=b.id

    WHERE LEFT(n.name,1)='A'

    AND b.ChildCount>50

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan.B (3/19/2013)


    If you have some sample data and DDL I can tell you what I think. I don't know if you have a clustered index on either table. If there was a clustered index or unique index you will likely see nested loops for the inner join queries.

    You are right. I should have given the table structures. Here it is:

    CREATE TABLE [dbo].[Names](

    [_id] [int] IDENTITY(1,1) NOT NULL,

    [Name] [varchar](50) NOT NULL

    ) ON [PRIMARY]

    CREATE TABLE [dbo].[TableB](

    [_id] [int] IDENTITY(1,1) NOT NULL,

    [ChildCount] [int] NULL

    ) ON [PRIMARY]

    I ran both these queries:

    select n.[_id]

    from Names n

    inner join TableB b

    on b.[_id] = n.[_id]

    and b.[ChildCount] > 50

    where n.[Name] Like 'A%'

    select n.[_id]

    from Names n

    inner join TableB b

    on b.[_id] = n.[_id]

    and b.[ChildCount] > 50

    and n.[Name] Like 'A%'

    And you were right! In both cases, hash match was used for the merger.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply