Query Optimization Advice

  • Hi All,

    So I'm faced with the need to optimize a pretty hefty query, as it's beentaking extremely long to run. Some optimization that's already been doneinclude the creation of multiple indexes on the involved tables. However, asit's still taking at least an hour or so to run, if it doesn't time outbefore due to memory issues, it still needs to be further optimized. Somethings I've tried are creating a few clustered indexes and statistics, but thishasn't improved performance. Upon analyzing the query further, I noticed theuse of a dozen or so left outer joins. After researching, I learnedthat this doesn't lend for good performance, and so I'm trying to replace themwith outer apply, but the query doesn't seem to be accepting it. Could I pleasetap into the great insight of SQL experts out there to get some advice onhow to go about optimizing this query? I'm attaching the codebelow, because the site wasn't responding well to me posting it here...please ignore the comments...thanks in advance! 

  • The query may run long, but if you could post the actual execution plan as a .sqlplan file that would provide us with additional information.

  • Right here. Instant table/clustered index scans. No seeks. No performance tuning possible other than buying faster disks and more memory:

    [CoverageLengthMonthNumber]<=<= roundround((((DATEDIFFDATEDIFF((DAYDAY,,CASTCAST(LEFT(RIGHT((LEFT(RIGHT([CoverageEffectiveDateKey][CoverageEffectiveDateKey],,44),),22)) ++ '/''/' ++ RIGHT(RIGHT([CoverageEffectiveDateKey][CoverageEffectiveDateKey],,22)) ++ '/''/'                       ++ LEFT(LEFT([CoverageEffectiveDateKey][CoverageEffectiveDateKey],,44)) ASAS DATEDATE),),CASTCAST(LEFT(RIGHT((LEFT(RIGHT(OriginalMaturityDateKeyOriginalMaturityDateKey,,44),),22)) ++ '/''/' ++ RIGHT(RIGHT(OriginalMaturityDateKeyOriginalMaturityDateKey,,22))                       ++ '/''/' ++ LEFT(LEFT(OriginalMaturityDateKeyOriginalMaturityDateKey,,44)) ASAS DATEDATE)))/()))/(365365//1212),),00)+)+5 5 --(1:45m 7.9M)--(1:45m 7.9M)

    I haven't even looked through the rest of the code, but I'll bet it's all problematic like this...
    Yeah, kept looking.... stopped
    You can't have functions like this on your columns in WHERE, HAVING or ON clauses. It prevents statistics use. It prevents appropriate index use. You have this kind of thing all over the place, plus calculations in a GROUP BY, plus more. And then it ends by grouping by about 50 or 60 columns including a bunch of those crazy calculations.

    You can't tune this. I can't tune this.

    This needs to be torn down to it's most component and fundamental parts and then rebuilt. The most important aspect has to be, building appropriate data structures so that you're not running functions like that, grouping on 50 columns, returning 7 pages worth of SELECT. Whatever series of compromises with the business that were done to arrive at this must be completely reassessed. Maybe move the data to a fact table or to a cube. Something, but this isn't working and won't really work, regardless of the indexes you pick.

    Sorry. I really don't mean to sound like the voice of doom or be a negative person, but this isn't something I'd spend time trying to save.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Clustering table  FctCoverageLengthDetail on ( EnrollmentID, CoverageID ) might help, although by how much I can't tell, as the query is obviously very complex.

    If you try that, then if at all possible, also move those columns to the start of the GROUP BY:

    GROUP BY
     FCLD.[EnrollmentID]
    ,FCLD.[CoverageID]
    ,FCLD.[FirstInvoiceDateKey]
    ,FCLD.[OriginalMaturityDateKey]
    ,FCLD.[CoverageEffectiveDateKey]
    ,FCLD.[CoverageEndDateKey]
    ,FCLD.[ProductKey]
    ,...

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Lynn Pettis - Wednesday, November 29, 2017 11:06 AM

    The query may run long, but if you could post the actual execution plan as a .sqlplan file that would provide us with additional information.

    Hi Lynn,

    Sure, please see the attached execution plan. Please let me know if you need any additional info....I appreciate your help.

    Thanks!

  • Grant Fritchey - Wednesday, November 29, 2017 11:56 AM

    Right here. Instant table/clustered index scans. No seeks. No performance tuning possible other than buying faster disks and more memory:

    [CoverageLengthMonthNumber]<=<= roundround((((DATEDIFFDATEDIFF((DAYDAY,,CASTCAST(LEFT(RIGHT((LEFT(RIGHT([CoverageEffectiveDateKey][CoverageEffectiveDateKey],,44),),22)) ++ '/''/' ++ RIGHT(RIGHT([CoverageEffectiveDateKey][CoverageEffectiveDateKey],,22)) ++ '/''/'                       ++ LEFT(LEFT([CoverageEffectiveDateKey][CoverageEffectiveDateKey],,44)) ASAS DATEDATE),),CASTCAST(LEFT(RIGHT((LEFT(RIGHT(OriginalMaturityDateKeyOriginalMaturityDateKey,,44),),22)) ++ '/''/' ++ RIGHT(RIGHT(OriginalMaturityDateKeyOriginalMaturityDateKey,,22))                       ++ '/''/' ++ LEFT(LEFT(OriginalMaturityDateKeyOriginalMaturityDateKey,,44)) ASAS DATEDATE)))/()))/(365365//1212),),00)+)+5 5 --(1:45m 7.9M)--(1:45m 7.9M)

    I haven't even looked through the rest of the code, but I'll bet it's all problematic like this...
    Yeah, kept looking.... stopped
    You can't have functions like this on your columns in WHERE, HAVING or ON clauses. It prevents statistics use. It prevents appropriate index use. You have this kind of thing all over the place, plus calculations in a GROUP BY, plus more. And then it ends by grouping by about 50 or 60 columns including a bunch of those crazy calculations.

    You can't tune this. I can't tune this.

    This needs to be torn down to it's most component and fundamental parts and then rebuilt. The most important aspect has to be, building appropriate data structures so that you're not running functions like that, grouping on 50 columns, returning 7 pages worth of SELECT. Whatever series of compromises with the business that were done to arrive at this must be completely reassessed. Maybe move the data to a fact table or to a cube. Something, but this isn't working and won't really work, regardless of the indexes you pick.

    Sorry. I really don't mean to sound like the voice of doom or be a negative person, but this isn't something I'd spend time trying to save.

    Thanks for your feedback, Grant. I couldn't agree with you more. When I was tasked with this, I couldn't believe how long of a query it was....very complicated and difficult to understand! I've asked for clarification from the business user and still it's not clear....but willing to try anything.

  • daniness - Wednesday, November 29, 2017 12:22 PM

    Thanks for your feedback, Grant. I couldn't agree with you more. When I was tasked with this, I couldn't believe how long of a query it was....very complicated and difficult to understand! I've asked for clarification from the business user and still it's not clear....but willing to try anything.

    If you were on 2016 or better, maybe, MAYBE, columnstore indexes could help, but even that I seriously doubt. I'm positive that no one index (or 10) is going to help here. The problems are too fundamental. Again, apologies for being Debbie Downer.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • daniness - Wednesday, November 29, 2017 10:39 AM

     I noticed theuse of a dozen or so left outer joins. After researching, I learnedthat this doesn't lend for good performance, and so I'm trying to replace themwith outer apply,

    Er, no, no, no!

    Joins SQL has a lot of options for implementing. Apply, since it means run the subquery once per row of the outer query, means a nested loop. Last time I saw a query using apply where it should have been a join, it ran in 13 hours. Change the apply to the correct join, 20 minutes.
    Join and apply are not interchangable. They have different meaning.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Okay, you seriously need to figure out what it is they are expecting and figure out a better way to get it.  You are scanning the same index several times with each returning 15+ million rows of data.  You are also scanning at least one table several times as well.
    Also, what is this doing (break it down into its pieces):

        AND [CoverageLengthMonthNumber] <= round((DATEDIFF(DAY,CAST(LEFT(RIGHT([CoverageEffectiveDateKey],4),2) + '/' + RIGHT([CoverageEffectiveDateKey],2) + '/'
               + LEFT([CoverageEffectiveDateKey],4) AS DATE),CAST(LEFT(RIGHT(OriginalMaturityDateKey,4),2) + '/' + RIGHT(OriginalMaturityDateKey,2)
               + '/' + LEFT(OriginalMaturityDateKey,4) AS DATE)))/(365/12),0)+5 --(1:45m 7.9M)

    And by the way, this won't be a trivial task.  I don't have any idea where to start, but then that is because I only see the query, not the data or the current results (again 15.5+ million rows of data).

  • Lynn Pettis - Wednesday, November 29, 2017 2:23 PM

    Okay, you seriously need to figure out what it is they are expecting and figure out a better way to get it.  You are scanning the same index several times with each returning 15+ million rows of data.  You are also scanning at least one table several times as well.
    Also, what is this doing (break it down into its pieces):

        AND [CoverageLengthMonthNumber] <= round((DATEDIFF(DAY,CAST(LEFT(RIGHT([CoverageEffectiveDateKey],4),2) + '/' + RIGHT([CoverageEffectiveDateKey],2) + '/'
               + LEFT([CoverageEffectiveDateKey],4) AS DATE),CAST(LEFT(RIGHT(OriginalMaturityDateKey,4),2) + '/' + RIGHT(OriginalMaturityDateKey,2)
               + '/' + LEFT(OriginalMaturityDateKey,4) AS DATE)))/(365/12),0)+5 --(1:45m 7.9M)

    And by the way, this won't be a trivial task.  I don't have any idea where to start, but then that is because I only see the query, not the data or the current results (again 15.5+ million rows of data).

    Thank you, Lynn...I'm still trying to decode this, but I appreciate your input :-).

  • GilaMonster - Wednesday, November 29, 2017 12:48 PM

    daniness - Wednesday, November 29, 2017 10:39 AM

     I noticed theuse of a dozen or so left outer joins. After researching, I learnedthat this doesn't lend for good performance, and so I'm trying to replace themwith outer apply,

    Er, no, no, no!

    Joins SQL has a lot of options for implementing. Apply, since it means run the subquery once per row of the outer query, means a nested loop. Last time I saw a query using apply where it should have been a join, it ran in 13 hours. Change the apply to the correct join, 20 minutes.
    Join and apply are not interchangable. They have different meaning.

    Thank you for your input, Gail! 🙂

  • Are you willing to try changing the Clustering on table FctCoverageLengthDetail as I suggested earlier?  I'm just curious as to how much it would help.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Hi ScottPletcher,

    Thanks for the suggestion....sorry I meant to reply to your earlier post. I will try changing the clustering and report back with the results. Thank you :-).

  • From somewhere inside the query:

      FROM [dbo].[FctCoverageLengthDetail]

    WHERE CASE

         WHEN([CoverageLengthMonthNumber]=13 or[CoverageLengthMonthNumber] =25 or[CoverageLengthMonthNumber] =37 or[CoverageLengthMonthNumber] =49

         or[CoverageLengthMonthNumber] =61 or[CoverageLengthMonthNumber] =73 or[CoverageLengthMonthNumber] =85 or[CoverageLengthMonthNumber] =97

         or[CoverageLengthMonthNumber] =109 or[CoverageLengthMonthNumber] =121 )

         AND[CoverageLengthMonthNumber] <(round((DATEDIFF(DAY,CAST(LEFT(RIGHT([CoverageEffectiveDateKey],4),2)+'/'+RIGHT([CoverageEffectiveDateKey],2)

          +'/'+LEFT([CoverageEffectiveDateKey],4)ASDATE),CAST(LEFT(RIGHT([CoverageEndDateKey],4),2)+'/'+RIGHT([CoverageEndDateKey],2)+'/'

          +LEFT([CoverageEndDateKey],4)ASDATE)))/(365/12),0))

         AND(round((DATEDIFF(DAY,CAST(LEFT(RIGHT([CoverageEffectiveDateKey],4),2)+'/'+RIGHT([CoverageEffectiveDateKey],2)+'/'

          +LEFT([CoverageEffectiveDateKey],4)ASDATE),CAST(LEFT(RIGHT([CoverageEndDateKey],4),2)+'/'+RIGHT([CoverageEndDateKey],2)+'/'

          +LEFT([CoverageEndDateKey],4)ASDATE)))/(365/12),0))

          -[CoverageLengthMonthNumber]<13

         THEN[CoverageLengthMonthNumber]

        ELSE 1

        END is not null

    To me it like the only way the expression may be NULL is when CoverageLengthMonthNumber is null.
    So, you can safely ditch the whole horrible CASE statement in WHERE clause and replace it with

    WHERE [CoverageLengthMonthNumber] is not null

    This is only one randomly picked point of improvement.

    _____________
    Code for TallyGenerator

  • ScottPletcher - Thursday, November 30, 2017 12:36 PM

    Are you willing to try changing the Clustering on table FctCoverageLengthDetail as I suggested earlier?  I'm just curious as to how much it would help.

    Hi ScottPletcher,
    So I tried your suggestion of the clustered index on the FctCoverageLengthDetail table, but I don't believe it improved performance, as the query took about 29 minutes to run, returning 15 M or so records. Thank you for the feedback though :-).

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply