JOIN BIG and SMALL TABLEs

  • tony28 (8/26/2014)


    ChrisM@Work (8/26/2014)


    tony28 (8/25/2014)


    I added second sql plan with query with condition before update statistics..

    is this so big different between before and after ?

    here looks almost ok

    Query 2 works very differently from the way it looks. Examination of the execution plan and the code results in this:

    SELECT

    tB.ORDER_DATE,

    tB.COMMIT_NO,

    tA.PRT_ID,

    tA.PRT_POS,

    tA.PRT_TITLE, tA.ITEM_CODE,

    isnull(tB.ALC_CODE,'') ALC_CODE,

    isnull(tC.PRINT_SCRIP,'') PRINT_SCRIP,

    CASE

    WHEN tC.PRINT_SCRIP = '' THEN tB.ALC_CODE

    WHEN tC.PRINT_SCRIP <> '' THEN tC.PRINT_SCRIP

    ELSE tB.ALC_CODE END ALC_RESULT

    FROM [dbo].[TB_WORK_ORDERDTL_CPM01] tb

    inner JOIN TB_CODE_ALC tC -- INNER JOIN to [TB_CODE_PRTFORM] tA turns this to an INNER JOIN

    ON tC.ITEM_CODE = tB.ITEM_CODE

    AND tC.ALC_CODE= tB.ALC_CODE

    AND tC.CAR_CODE = CASE

    WHEN tb.BODY_NO like 'A6%' THEN 'GDe'

    WHEN tb.BODY_NO like'IT%' THEN 'EL'

    WHEN tb.BODY_NO like 'JC%' THEN 'JC'

    ELSE NULL END -- tB.CAR_CODE

    inner JOIN [TB_CODE_PRTFORM] ta -- changed to INNER JOIN

    ON tC.CAR_CODE= tA.CAR_CODE

    AND tC.ITEM_CODE = tA.ITEM_CODE

    --and tA.PRT_ID='CPM-PRT-01' REPEATED IN where CLAUSE

    WHERE tb.ORDER_DATE = '20140822'

    AND tA.CAR_CODE IN ('EL','JC','GDe')

    -- AND SUBSTRING(ta.PRT_ID,1,3) = SUBSTRING('CPM01',1,3) -- unnecessary: ta.PRT_ID='CPM-PRT-01'

    AND (tA.PRT_POS BETWEEN 1 AND 60) -- turns RIGHT JOIN [TB_CODE_PRTFORM] tA into an INNER JOIN

    aND ta.PRT_ID='CPM-PRT-01'

    ORDER BY tb.ORDER_DATE, tb.COMMIT_NO, BODY_NO, ta.ITEM_CODE

    Test it: if the results are the same as your original query 2, then use it.

    Looks good, after write this topics I little modify - remove string operation and etc. I dont use Inner join, but in few things was very better than Left or Right join, so I have to learn to use normally

    IO is almost same, and Ex plan looks more better, according to subthree cost, but memory grant is worst.

    and time

    SQL Server Execution Times:

    CPU time = 1595 ms, elapsed time = 1427 ms.

    SQL Server Execution Times:

    CPU time = 734 ms, elapsed time = 1577 ms.

    Not updated stats 20140626

    Your

    (45389 row(s) affected)

    Table 'TB_CODE_ALC'. Scan count 0, logical reads 90778, physical

    Table 'TB_CODE_PRTFORM'. Scan count 88673, logical reads 256594,

    Table 'TB_WORK_ORDERDTL_CPM01'. Scan count 1, logical reads 750,

    My

    (45389 row(s) affected)

    Table 'TB_CODE_ALC'. Scan count 0, logical reads 90778, physical

    Table 'TB_CODE_PRTFORM'. Scan count 88673, logical reads 256594,

    Table 'TB_WORK_ORDERDTL_CPM01'. Scan count 1, logical reads 750,

    Updated stats 20140622

    Your(62740 row(s) affected)

    Table 'TB_CODE_PRTFORM'. Scan count 3, logical reads 11,

    Table 'TB_CODE_ALC'. Scan count 9, logical reads 568,

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0

    Table 'TB_WORK_ORDERDTL_CPM01'. Scan count 9, logical reads 1132,

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0

    My

    (62740 row(s) affected)

    Table 'Worktable'. Scan count 4, logical reads 33, physical reads

    Table 'TB_WORK_ORDERDTL_CPM01'. Scan count 1, logical reads 995,

    Table 'TB_CODE_PRTFORM'. Scan count 1, logical reads 16, physical

    Table 'TB_CODE_ALC'. Scan count 1, logical reads 155, physical rea

    UPDATE STATISTICS isn't the same as "include a filter on COMMIT_NO in the WHERE clause". Read about UPDATE STATISTICS here and if necessary ask your DBA to do it for you. SQL Server will generate good plans when statistics are up to date, and may generate bad plans when they are not. There are indications in your execution plans that your statistics are not up to date.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (8/27/2014)


    tony28 (8/26/2014)


    ChrisM@Work (8/26/2014)


    tony28 (8/25/2014)


    I added second sql plan with query with condition before update statistics..

    is this so big different between before and after ?

    here looks almost ok

    Query 2 works very differently from the way it looks. Examination of the execution plan and the code results in this:

    SELECT

    tB.ORDER_DATE,

    tB.COMMIT_NO,

    tA.PRT_ID,

    tA.PRT_POS,

    tA.PRT_TITLE, tA.ITEM_CODE,

    isnull(tB.ALC_CODE,'') ALC_CODE,

    isnull(tC.PRINT_SCRIP,'') PRINT_SCRIP,

    CASE

    WHEN tC.PRINT_SCRIP = '' THEN tB.ALC_CODE

    WHEN tC.PRINT_SCRIP <> '' THEN tC.PRINT_SCRIP

    ELSE tB.ALC_CODE END ALC_RESULT

    FROM [dbo].[TB_WORK_ORDERDTL_CPM01] tb

    inner JOIN TB_CODE_ALC tC -- INNER JOIN to [TB_CODE_PRTFORM] tA turns this to an INNER JOIN

    ON tC.ITEM_CODE = tB.ITEM_CODE

    AND tC.ALC_CODE= tB.ALC_CODE

    AND tC.CAR_CODE = CASE

    WHEN tb.BODY_NO like 'A6%' THEN 'GDe'

    WHEN tb.BODY_NO like'IT%' THEN 'EL'

    WHEN tb.BODY_NO like 'JC%' THEN 'JC'

    ELSE NULL END -- tB.CAR_CODE

    inner JOIN [TB_CODE_PRTFORM] ta -- changed to INNER JOIN

    ON tC.CAR_CODE= tA.CAR_CODE

    AND tC.ITEM_CODE = tA.ITEM_CODE

    --and tA.PRT_ID='CPM-PRT-01' REPEATED IN where CLAUSE

    WHERE tb.ORDER_DATE = '20140822'

    AND tA.CAR_CODE IN ('EL','JC','GDe')

    -- AND SUBSTRING(ta.PRT_ID,1,3) = SUBSTRING('CPM01',1,3) -- unnecessary: ta.PRT_ID='CPM-PRT-01'

    AND (tA.PRT_POS BETWEEN 1 AND 60) -- turns RIGHT JOIN [TB_CODE_PRTFORM] tA into an INNER JOIN

    aND ta.PRT_ID='CPM-PRT-01'

    ORDER BY tb.ORDER_DATE, tb.COMMIT_NO, BODY_NO, ta.ITEM_CODE

    Test it: if the results are the same as your original query 2, then use it.

    Looks good, after write this topics I little modify - remove string operation and etc. I dont use Inner join, but in few things was very better than Left or Right join, so I have to learn to use normally

    IO is almost same, and Ex plan looks more better, according to subthree cost, but memory grant is worst.

    and time

    SQL Server Execution Times:

    CPU time = 1595 ms, elapsed time = 1427 ms.

    SQL Server Execution Times:

    CPU time = 734 ms, elapsed time = 1577 ms.

    Not updated stats 20140626

    Your

    (45389 row(s) affected)

    Table 'TB_CODE_ALC'. Scan count 0, logical reads 90778, physical

    Table 'TB_CODE_PRTFORM'. Scan count 88673, logical reads 256594,

    Table 'TB_WORK_ORDERDTL_CPM01'. Scan count 1, logical reads 750,

    My

    (45389 row(s) affected)

    Table 'TB_CODE_ALC'. Scan count 0, logical reads 90778, physical

    Table 'TB_CODE_PRTFORM'. Scan count 88673, logical reads 256594,

    Table 'TB_WORK_ORDERDTL_CPM01'. Scan count 1, logical reads 750,

    Updated stats 20140622

    Your(62740 row(s) affected)

    Table 'TB_CODE_PRTFORM'. Scan count 3, logical reads 11,

    Table 'TB_CODE_ALC'. Scan count 9, logical reads 568,

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0

    Table 'TB_WORK_ORDERDTL_CPM01'. Scan count 9, logical reads 1132,

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0

    My

    (62740 row(s) affected)

    Table 'Worktable'. Scan count 4, logical reads 33, physical reads

    Table 'TB_WORK_ORDERDTL_CPM01'. Scan count 1, logical reads 995,

    Table 'TB_CODE_PRTFORM'. Scan count 1, logical reads 16, physical

    Table 'TB_CODE_ALC'. Scan count 1, logical reads 155, physical rea

    UPDATE STATISTICS isn't the same as "include a filter on COMMIT_NO in the WHERE clause". Read about UPDATE STATISTICS here and if necessary ask your DBA to do it for you. SQL Server will generate good plans when statistics are up to date, and may generate bad plans when they are not. There are indications in your execution plans that your statistics are not up to date.

    Hello, thank you , I had vacation so sorry for delay...

    I know about update statistics and how it function, but I told that we have update statistics at Saturday 15:00. So if it possible get good result without update ... we works 24/5 and sometimes during break. I dont know when I can update statistics except weekend.....

  • Hello guys, I disscused with one DB specialist and I found solution for my problem...

    because with data in statistics was very good results, and this use execution plan with HASH join.. so I directly write to join HASH and result is very very better than before and all data are very fast.

    select tB.ORDER_DATE, tB.COMMIT_NO,

    CASE WHEN tC.PRINT_SCRIP='' THEN tB.ALC_CODE

    WHEN tC.PRINT_SCRIP<>'' THEN tC.PRINT_SCRIP

    ELSE tB.ALC_CODE

    END ALC_CODE

    FROM [TB_CODE_PRTFORM] tA

    RIGHT hash JOIN TB_CODE_ALC tC

    ON tC.CAR_CODE= tA.CAR_CODE

    --and tA.PRT_ID='CPM-PRT-01'

    AND tC.ITEM_CODE = tA.ITEM_CODE

    LEFT HASH JOIN ( SELECT *,CASE WHEN BODY_NO like 'A6%' THEN 'GDe'

    WHEN BODY_NO like'IT%' THEN 'EL'

    WHEN BODY_NO like 'JC%' THEN 'JC'

    END CAR_CODE FROM [dbo].[TB_WORK_ORDERDTL_CPM01] --with(nolock)

    WHERE ORDER_DATE between '20140910' and '20140910'

    AND COMMIT_NO between '0001' and '1330'

    ) tB

    ON tC.ITEM_CODE = tB.ITEM_CODE

    AND tC.CAR_CODE = tB.CAR_CODE

    AND tC.ALC_CODE= tB.ALC_CODE

    WHERE ORDER_DATE IS NOT NULL AND tA.CAR_CODE IN ('EL','JC','GDe') aND PRT_ID='CPM-PRT-01'

    ORDER BY ORDER_DATE, COMMIT_NO, BODY_NO OPTION (MAXDOP 1)

  • result with data in statistics are very good so I will not write here..

    but result with data which aren in statistics in this time were very poor, after optimalization are very very better

    you can check

    Ex. plan maybe shows 98% vs 2%, subtree cost 0.474001 vs 0.0099004, memory grant 2272 vs 0

    but IO operation are finally how I wanted.

    (43759 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0,

    Table 'TB_WORK_ORDERDTL_CPM01'. Scan count 1, logical reads 741

    Table 'TB_CODE_ALC'. Scan count 130, logical reads 392, -- If I use HASH join also on first join , result is 1 and 197

    Table 'TB_CODE_PRTFORM'. Scan count 3, logical reads 11,

    SQL Server Execution Times:

    CPU time = 469 ms, elapsed time = 1084 ms.

    vs

    (43759 row(s) affected)

    Table 'TB_CODE_ALC'. Scan count 0, logical reads 90746,

    Table 'TB_CODE_PRTFORM'. Scan count 88586, logical reads 254448,

    Table 'TB_WORK_ORDERDTL_CPM01'. Scan count 1, logical reads 742,

    SQL Server Execution Times:

    CPU time = 3484 ms, elapsed time = 4591 ms.

    So what do you think about this solution ?

  • tony28 (9/10/2014)


    Hello guys, I disscused with one DB specialist and I found solution for my problem...

    because with data in statistics was very good results,

    ...

    tony28 (9/10/2014)


    ...result with data in statistics are very good so I will not write here..

    but result with data which aren in statistics ...

    Can you explain what you mean by "data in statistics" and "data which are not in statistics"? I'm sure I'm not alone in being confused by this.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Referring to your original post in this thread:

    tony28 (8/25/2014)


    I have one small table about 150rows and second bigger about 50000rows

    Three tables are referenced in your query:

    Table [TB_WORK_ORDERDTL_CPM01] 10,856,500 rows

    Table [TB_CODE_PRTFORM] 1,165 rows

    Table [TB_CODE_ALC] 31,234 rows

    If I join together and I selected just one item from bigger table I have this

    (51 row(s) affected)

    Table 'TB_CODE_ALC'. Scan count 0, logical reads 102, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TB_CODE_PRTFORM'. Scan count 101, logical reads 303, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TB_WORK_ORDERDTL_CPM01'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    If I selected all what I need I have

    (53624 row(s) affected)

    Table 'TB_CODE_ALC'. Scan count 0, logical reads 107248, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TB_CODE_PRTFORM'. Scan count 104750, logical reads 302940, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TB_WORK_ORDERDTL_CPM01'. Scan count 1, logical reads 879, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Why is possible that data from small data are still load to memory ?

    Both queries require all three tables to be read.

    Do you have any strategy for these joins ?

    Yes. Change them to inner joins in the query. The logic of the query dictates that they should be inner joins as explained earlier, and as shown by the execution plans.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 6 posts - 16 through 21 (of 21 total)

You must be logged in to reply to this topic. Login to reply