Any tips on speeding this SQL up would be most helpful

  • SET STATISTICS TIME ON

    Ok. Can you try doing this ^ before running the query and check the Messages tab for details?

    Or run the following query, add text from the query, may be an alias used, in '%%'

    SELECT eqs.last_elapsed_time, eqs.total_elapsed_time / eqs.execution_count

    FROM sys.dm_exec_query_stats eqs

    OUTER APPLY sys.dm_exec_sql_text(eqs.sql_handle) est

    WHERE est.text LIKE '%%'

    https://sqlroadie.com/

  • Most expensive operation in my query is for sorting, which we cannot avoid I think. Can you please do the following and check if there is any improvement?

    1. Firstly, do this. Looks like statistics are stale.

    UPDATE STATISTICS dbo.ub_charge

    2. Can you please check what the current Max DOP setting is, is it 4?

    EXEC sp_configure 'show advanced options', 1;

    GO

    RECONFIGURE;

    GO

    EXEC sp_configure 'max degree of parallelism'

    3. Also, please run this and post the output.

    SELECT COUNT(1) FROM sys.dm_os_schedulers WHERE status = 'visible online'

    4. Add the following at end of the query and run it query again.

    (OPTION MAXDOP 1)

    If this takes longer, we can experiment with different MAXDOP values and find the optimum.

    https://sqlroadie.com/

  • This is what was returned with "SET STATISTICS TIME ON" for your query.

    (2570 row(s) affected)

    SQL Server Execution Times:

    CPU time = 3185 ms, elapsed time = 2322 ms.

    Thanks for all the help by the way, I have some experience with SQL, but not with SQL admin, but I have been thrown into it. I am learning!

  • Okay, I updated the statistics and then I ran the next bit of SQL and this is the output:

    Configuration option 'show advanced options' changed from 1 to 1. Run the RECONFIGURE statement to install.

    name minimum maximum config_value run_value

    ----------------------------------- ----------- ----------- ------------ -----------

    max degree of parallelism 0 32767 0 0

    I then ran this:

    SELECT COUNT(1) FROM sys.dm_os_schedulers WHERE status = 'visible online'

    and it returned: 4

    Then I added this to the end my query and ran it again and this is the execution time:

    OPTION (MAXDOP 1)

    SQL Server Execution Times:

    CPU time = 2484 ms, elapsed time = 5025 ms.

    I attached the SQLplan again in case it was needed.

  • Thanks for doing all of that 🙂

    Let me explain. The server has 4 processors. MAX DOP is set at zero, meaning optimizer will use as many number of processors to execute the query as it wishes. It was using all 4 of them.

    Previously, CPU time = 3185 ms, elapsed time = 2322 ms.

    This meant query was executed in 2322 ms. 4 processors spent a cumulative 3.185 s on the query.

    When you specified MAXDOP 1, only 1 processor worked on the query and it took longer for query to complete. I hope this makes sense.

    CPU time = 2484 ms, elapsed time = 5025 ms

    In short, you are better off with the original query and not specifying a MAXDOP, but if you are curious try OPTION (MAXDOP 2) and check execution times.

    I think without adding covering indexes, this is the fastest it will get. Now that statistics are refreshed, can you please run the query again, without the MAXDOP hint and post new plan?

    https://sqlroadie.com/

  • Thank you for the explanation. Here is the latest execution time:

    (2570 row(s) affected)

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 3593 ms, elapsed time = 2785 ms.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    And attached is the .sqlplan

    You mentioned you might know why your query had more records, are you feeling like it is returning more and it is correct and mine was not returning some?

  • Thanks for that mate. I think your query returns less rows because of this condition.

    AND (

    SELECT SUM(csub.charge_qty)

    FROM ub_charge_test csub

    WHERE csub.charge_code = c.Charge_code AND

    csub.ub_master_id = m.ub_master_id

    ) > 0

    Now, let us create an index and check how that affects the query.

    Index 1: [ub_master_id] INCLUDE [charge_amount], [charge_code], [charge_qty]

    Can you please run query after creating this, and post plan only if time is any better? My guess is that this will make the query a little faster.

    Index 2: [a_site_id] INCLUDE [charge_amount], [charge_code], [charge_date], [charge_description], [charge_qty], [CPTCode], [CPTCode_Description]

    I do not really recommend this index as it is expensive to maintain, but you could try adding it and see how it goes, and drop it later may be.

    https://sqlroadie.com/

  • The execution plan indicates that the subquery picking up the row with the max date is the most expensive operation.

    If charge_date is unique within each partition of a_site_id/charge_code, then this might be more efficient:

    WITH charges AS (

    SELECT

    c.a_site_id,

    c.charge_code,

    SUM(c.charge_qty) AS SumQty,

    SUM(c.charge_amount) AS TotalRevenue,

    COUNT(DISTINCT c.ub_master_id) AS CountByClaim,

    MAX(c.charge_date) AS Max_charge_date

    FROM ub_charge c

    WHERE EXISTS (

    SELECT 1

    FROM ub_master m

    WHERE c.ub_master_id = m.ub_master_id

    AND m.a_site_id = 1

    AND m.importversion = 1

    AND m.ub_batch_sub_number = 0

    AND m.date_discharged BETWEEN '12/1/2015' AND '11/30/2016'

    )

    GROUP BY c.a_site_id, c.charge_code

    )

    SELECT c.charge_code,

    l.LatestDescription,

    l.LatestAmount,

    c.SumQty,

    c.TotalRevenue,

    c.CountByClaim,

    l.CPTCode,

    l.CPTCode_Description

    FROM charges c

    CROSS APPLY (

    SELECT

    sub.charge_description AS LatestDescription,

    sub.charge_amount / ISNULL(NULLIF(sub.charge_qty,0),1) as LatestAmount,

    sub.CPTCode,

    sub.CPTCode_Description

    FROM ub_charge sub

    WHERE sub.a_site_id = c.a_site_id

    AND sub.charge_code = c.charge_code

    AND sub.charge_date = c.Max_charge_date

    ) l

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • @ChrisM@Work - Your query did run faster, however it is not unique and so it brings up 9,266 rows instead of the 2,570.

    @arjun Sivadasan - I added the index and it is even faster.

    SQL Server Execution Times:

    CPU time = 2732 ms, elapsed time = 1354 ms.

    Regarding the extra rows, I was doing that code and I guess I forgot to mention that part for my criteria of filtering. I need to eliminate any charge codes that the sum(charge_amount) of those that match ub_master_id is 0 or less. So I will need to put that back in.

    I added the latter index and it didn't make it faster, actually slowed it down some and it was big to create, so I will leave that one off.

    I attached the latest sqlplan as well.

  • Well done! I think you only need to add following condition after the group by.

    having sum(charge_amount) > 0

    Edit: Your original query checks for sum(charge_qty) > 0. Do both conditions produce the same results?

    https://sqlroadie.com/

  • douglas.t (1/9/2017)


    @ChrisM@Work - Your query did run faster, however it is not unique and so it brings up 9,266 rows instead of the 2,570.

    Multiple rows per day I guess:

    WITH charges AS (

    SELECT

    c.a_site_id,

    c.charge_code,

    SUM(c.charge_qty) AS SumQty,

    SUM(c.charge_amount) AS TotalRevenue,

    COUNT(DISTINCT c.ub_master_id) AS CountByClaim,

    MAX(c.charge_date) AS Max_charge_date

    FROM ub_charge c

    WHERE EXISTS (

    SELECT 1

    FROM ub_master m

    WHERE c.ub_master_id = m.ub_master_id

    AND m.a_site_id = 1

    AND m.importversion = 1

    AND m.ub_batch_sub_number = 0

    AND m.date_discharged BETWEEN '12/1/2015' AND '11/30/2016'

    )

    GROUP BY c.a_site_id, c.charge_code

    )

    SELECT c.charge_code,

    l.LatestDescription,

    l.LatestAmount,

    c.SumQty,

    c.TotalRevenue,

    c.CountByClaim,

    l.CPTCode,

    l.CPTCode_Description

    FROM charges c

    CROSS APPLY (

    SELECT TOP 1

    sub.charge_description AS LatestDescription,

    sub.charge_amount / ISNULL(NULLIF(sub.charge_qty,0),1) as LatestAmount,

    sub.CPTCode,

    sub.CPTCode_Description

    FROM ub_charge sub

    WHERE sub.a_site_id = c.a_site_id

    AND sub.charge_code = c.charge_code

    AND sub.charge_date = c.Max_charge_date

    ) l

    WHERE c.SumQty > 0

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Arjun Sivadasan (1/9/2017)


    Well done! I think you only need to add following condition after the group by.

    having sum(charge_amount) > 0

    Edit: Your original query checks for sum(charge_qty) > 0. Do both conditions produce the same results?

    Arjun, your query generates different results to mine and the OP's using the test data:

    if 0 = 1 begin;

    DELETE FROM #ub_charge;

    DELETE FROM #ub_master;

    INSERT INTO #ub_master

    (ub_master_id, a_site_id, importversion, ub_batch_sub_number, date_discharged)

    VALUES

    (1, 5, 1, 0, '12/20/2016'),

    (2, 3, 1, 0, '12/21/2016'),

    (3, 3, 1, 1, '01/02/2017'),

    (4, 3, 2, 0, '01/03/2017'),

    (5, 3, 1, 0, '01/01/2017'),

    (6, 3, 1, 0, '01/03/2017');

    INSERT INTO #ub_charge

    (ub_charge_id, ub_master_id, a_site_id, charge_code, charge_amount, charge_qty, charge_date, charge_description, cptcode, cptcode_description)

    VALUES

    (1, 1, 5, '1000', 50, 2, '01/3/2017', 'PRODUCT 1000 V1 (m=1)', 'code1', 'cpt description 1'),

    (2, 1, 5, '2000', 100, 1, '12/17/2016', 'PRODUCT 2000 V1 (m=1)', 'code2', 'cpt description 2'),

    (11, 1, 5, '5000', 90, 2, '12/17/2016', 'PRODUCT 5000 V1 (m=1)', 'code11', 'cpt description 11'),

    (3, 2, 3, '3000', 500, 1, '12/20/2016', 'PRODUCT 3000 V1 (m=2)', 'code3', 'cpt description 3'),

    (4, 2, 3, '2000', 100, 1, '12/19/2016', 'PRODUCT 2000 V2 (m=2)', 'code4', 'cpt description 4'),

    (5, 3, 3, '1000', 30, 1, '1/2/2017', 'PRODUCT 1000 V2 (m=3)', 'code5', 'cpt description 5'),

    (6, 3, 3, '2000', 210, 2, '1/2/2017', 'PRODUCT 2000 V3 (m=3)', 'code6', 'cpt description 6'),

    (7, 4, 3, '3000', 1000, 2, '12/21/2016', 'PRODUCT 3000 V2 (m=4)', 'code7', 'cpt description 7'),

    (8, 4, 3, '4000', 10, 2, '12/20/2016', 'PRODUCT 4000 V1 (m=4)', 'code8', 'cpt description 8'),

    (9, 5, 3, '1000', 100, 4, '11/19/2016', 'PRODUCT 1000 V3 (m=5)', 'code9', 'cpt description 9'),

    (10, 5, 3, '2000', 220, 2, '11/20/2016', 'PRODUCT 2000 V4 (m=5)', 'code10', 'cpt description 10'),

    (12, 5, 3, '2000', 220, 1, '11/20/2016', 'PRODUCT 2000 V5 (m=5)', 'code12', 'cpt description 12'),

    (13, 6, 3, '1000', 100, 1, '01/01/2017', 'PRODUCT 1000 V4 (m=6)', 'code13', 'cpt description 13'),

    (14, 6, 3, '1000', 100, 1, '01/01/2017', 'PRODUCT 1000 V5 (m=6)', 'code14', 'cpt description 14');

    end;

    ------------------------------------------------------------------------------------------------------------

    DECLARE @site_id INT = 3,

    @importversion INT = 1,

    @batchsubnumber INT = 0,

    @datefrom DATETIME = '1jan2017',

    @dateto DATETIME = '1may2017';

    ------------------------------------------------------------------------------------------------------------

    -- Arjun

    WITH cteChargeDesc

    AS

    (

    SELECT charge_code, charge_description, charge_date, charge_amount, charge_qty, cptcode, cptcode_description,

    ROW_NUMBER() OVER(PARTITION BY charge_code ORDER BY charge_date DESC) rn

    FROM #ub_charge

    WHERE a_site_id = @site_id

    )

    ,cteMaster AS

    (

    SELECT ub_master_id

    FROM #ub_master

    WHERE a_site_id = @site_id AND importVersion = @importversion AND ub_batch_sub_number = @batchsubnumber AND date_discharged BETWEEN @datefrom AND @dateto

    )

    SELECT ct.charge_code, cd.charge_description, cd.latest_amount, SUM(ct.charge_qty) [sum_qty], SUM(ct.charge_amount) [total_revenue],

    COUNT(DISTINCT ct.ub_master_id) [count_by_claim], cd.CPTCode, cd.CPTCode_Description

    FROM #ub_charge ct

    INNER JOIN cteMaster m

    ON m.ub_master_id = ct.ub_master_id

    OUTER APPLY

    (

    SELECT icd.charge_description, icd.charge_amount / nullif(icd.charge_qty, 0) [latest_amount], icd.cptcode, icd.cptcode_description

    FROM cteChargeDesc icd

    WHERE icd.charge_code = ct.charge_code AND icd.rn = 1

    ) cd

    GROUP BY ct.charge_code, cd.charge_description, cd.latest_amount, cd.CPTCode, cd.CPTCode_Description

    -------------------------------------------------------------------------------------------------------------------------------

    -- ChrisM

    ;WITH charges AS (

    SELECT

    c.a_site_id,

    c.charge_code,

    SUM(c.charge_qty) AS SumQty,

    SUM(c.charge_amount) AS TotalRevenue,

    COUNT(DISTINCT c.ub_master_id) AS CountByClaim,

    MAX(c.charge_date) AS Max_charge_date

    FROM #ub_charge c

    WHERE EXISTS (

    SELECT 1

    FROM #ub_master m

    WHERE c.ub_master_id = m.ub_master_id

    AND m.a_site_id = @site_id

    AND m.importversion = @importversion

    AND m.ub_batch_sub_number = @batchsubnumber

    AND m.date_discharged BETWEEN @datefrom AND @dateto

    )

    GROUP BY c.a_site_id, c.charge_code

    HAVING SUM(c.charge_qty) > 0

    )

    SELECT c.charge_code,

    l.LatestDescription,

    l.LatestAmount,

    c.SumQty,

    c.TotalRevenue,

    c.CountByClaim,

    l.CPTCode,

    l.CPTCode_Description

    FROM charges c

    CROSS APPLY (

    SELECT TOP 1

    sub.charge_description AS LatestDescription,

    sub.charge_amount / ISNULL(NULLIF(sub.charge_qty,0),1) as LatestAmount,

    sub.CPTCode,

    sub.CPTCode_Description

    FROM #ub_charge sub

    WHERE sub.a_site_id = c.a_site_id

    AND sub.charge_code = c.charge_code

    AND sub.charge_date = c.Max_charge_date

    ) l

    --------------------------------------------------------------------------------------------------

    -- Original

    -- Here I am getting the charges that do not cancel out each other. In the DB I could have a 1 qty and a -1 qty, so I want to eliminate those.

    ;WITH ubc as (

    SELECT c.ub_charge_id from #ub_master m

    JOIN #ub_charge c ON c.ub_master_id = m.ub_master_id

    AND (

    SELECT SUM(csub.charge_qty)

    FROM #ub_charge csub

    WHERE csub.charge_code = c.Charge_code AND

    csub.ub_master_id = m.ub_master_id

    ) > 0

    WHERE

    m.a_site_id = @site_id

    AND m.importversion = @importversion

    AND m.ub_batch_sub_number = @batchsubnumber

    AND m.date_discharged BETWEEN @datefrom AND @dateto

    ),

    -- Here I am using row_number() to limit to the latest Charge Amount and Description

    cte as (

    SELECT * FROM (

    SELECT

    c.ub_master_id,

    c.charge_code,

    -- I am dividing by the charge_qty so that I get the original price per unit

    case when c.charge_qty <> 0 then (c.charge_amount / c.charge_qty) else c.charge_amount END as chargeAmount,

    c.charge_description,

    c.CPTCode,

    c.CPTCode_Description,

    row_number() over(partition by c.charge_code order by c.charge_date desc) as rownum

    from #ub_charge c

    WHERE

    -- Making sure I am getting the latest positive price and not credits

    c.charge_amount >= 0

    AND c.ub_charge_id IN (select ub_charge_id from ubc)

    ) t

    where t.rownum = 1

    )

    --Now fetching the columns and querying the stats that I wanted.

    select

    cte.charge_code,

    cte.charge_description as LatestDescription,

    cte.chargeamount as LatestAmount,

    (select sum(charge_qty) from #ub_charge c

    WHERE c.charge_code = cte.charge_code

    AND c.ub_charge_id IN (select ub_charge_id from ubc)

    ) SumQty,

    (select sum(charge_amount) from #ub_charge c

    WHERE c.charge_code = cte.charge_code

    AND c.ub_charge_id IN (select ub_charge_id from ubc)

    ) TotalRevenue

    ,

    (

    select count( t.charge_code ) from (

    select distinct ub_master_id, charge_code from #ub_charge

    WHERE ub_charge_id IN (select ub_charge_id from ubc)

    AND charge_code = cte.charge_code

    ) t

    ) CountByClaim

    from cte

    In table ub_charge I've put the ub_master_id into the charge_description column so you can see which ub_master_id was in the row, in the output.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Arjun Sivadasan (1/9/2017)


    Well done! I think you only need to add following condition after the group by.

    having sum(charge_amount) > 0

    Edit: Your original query checks for sum(charge_qty) > 0. Do both conditions produce the same results?

    I added that line and it gave me this result count: 2500 (40 less than I need)

    ChrisM@Work (1/10/2017)


    -- ChrisM

    ;WITH charges AS (

    SELECT

    c.a_site_id,

    c.charge_code,

    SUM(c.charge_qty) AS SumQty,

    SUM(c.charge_amount) AS TotalRevenue,

    COUNT(DISTINCT c.ub_master_id) AS CountByClaim,

    MAX(c.charge_date) AS Max_charge_date

    FROM #ub_charge c

    WHERE EXISTS (

    SELECT 1

    FROM #ub_master m

    WHERE c.ub_master_id = m.ub_master_id

    AND m.a_site_id = @site_id

    AND m.importversion = @importversion

    AND m.ub_batch_sub_number = @batchsubnumber

    AND m.date_discharged BETWEEN @datefrom AND @dateto

    )

    GROUP BY c.a_site_id, c.charge_code

    HAVING SUM(c.charge_qty) > 0

    )

    SELECT c.charge_code,

    l.LatestDescription,

    l.LatestAmount,

    c.SumQty,

    c.TotalRevenue,

    c.CountByClaim,

    l.CPTCode,

    l.CPTCode_Description

    FROM charges c

    CROSS APPLY (

    SELECT TOP 1

    sub.charge_description AS LatestDescription,

    sub.charge_amount / ISNULL(NULLIF(sub.charge_qty,0),1) as LatestAmount,

    sub.CPTCode,

    sub.CPTCode_Description

    FROM #ub_charge sub

    WHERE sub.a_site_id = c.a_site_id

    AND sub.charge_code = c.charge_code

    AND sub.charge_date = c.Max_charge_date

    ) l

    This query gives me the correct return count (2540) and does it with these stats:

    SQL Server Execution Times:

    CPU time = 1796 ms, elapsed time = 795 ms.

    Thank you Arjun and ChrisM. I have attached ChrisM sqlplan just in case anyone wants to look at it. You both have been very helpful in different ways. Maybe looking at the SQLPlan (which I cannot read yet very well) you can see if it needs a different index or anything else that will be even quicker, but if not, this is amazing and I am so grateful to all.

  • ChrisM@Work (1/10/2017)


    Arjun, your query generates different results to mine and the OP's using the test data:

    Chris: True! Mine matches with OP's expected output (check the latest_amount and description columns), but I think yours does what the OP actually intended and is definitely simpler, so all good.

    https://sqlroadie.com/

  • douglas.t (1/10/2017)


    Arjun Sivadasan (1/9/2017)


    Well done! I think you only need to add following condition after the group by.

    having sum(charge_amount) > 0

    Edit: Your original query checks for sum(charge_qty) > 0. Do both conditions produce the same results?

    I added that line and it gave me this result count: 2500 (40 less than I need)

    Which condition did you add?

    sum(charge_amount) > 0

    or

    sum(charge_qty) >0

    douglas.t (1/10/2017)


    Thank you Arjun and ChrisM. I have attached ChrisM sqlplan just in case anyone wants to look at it. You both have been very helpful in different ways. Maybe looking at the SQLPlan (which I cannot read yet very well) you can see if it needs a different index or anything else that will be even quicker, but if not, this is amazing and I am so grateful to all.

    Can you please check the latest amount and description columns and confirm if they show the expected values? If that's fine, I think Chris's query is about as good as it gets. It'll be interesting to watch if someone can give us a faster query.

    Execution plan can seem a little cryptic, but make him your best friend and you will have a wonderful relationship 🙂

    https://sqlroadie.com/

Viewing 15 posts - 16 through 30 (of 31 total)

You must be logged in to reply to this topic. Login to reply