Improving performance using partitioning

  • Sorry for the delay.  I finally had the chance to look at your code. 

    Your "simple" code calls a view called GetDatesDiagnostic.  In that view, the C_Date column is a fully calculated column.  The trouble with that is (and as someone else said on your post thread) it must do the calculation for ALL the rows in the view (that's what is meant by "materializing" the view) before the filter in your simple query can work.  Since that view calls on yet another view (normally a really bad idea for the reason already stated), that other view must also be fully materialized.  That's a ton of data and, basically, all rows for all patients must be materialized in both views.  That's a whole lot of work and it'll never be fast until you can whittle down the dates or the patient info a little earlier.  To make matters worse, it's doing thousands of seeks on many of the tables instead of a seek and range scan or just a single row seek.

    Your "simple" query also has both a DISTINCT and a GROUP BY, which inherently does a DISTINCT.

    I don't know if it will actually work because I don't have your data to test with but I believe that a little "Divide'n'Conquer" is in order.  Instead of causing all of the rows in both views to materialize because of the filter attempts on a calculated column in the outer view, let's get and isolate all of the data for the patient and then do the filtering.  We'll do that with a Temp Table and then we'll do the filtering on that.  And, by the way, your date filtering is incorrect.  You don't need to filter out NULLs because they won't be included on anything you filter (unless you're doing all of this from a GUI api where ANSI NULLs are unfortunately NOT in play) and you should NEVER use a time other than 00:00:00.000 when trying to filter whole days because you could miss data when using even 23:59:50.000 because you've left out a full second of the day.

    With all that in mind, here's what I'd try before writing something on my own to get away from those awful views.

     SELECT C_Date
       INTO #MyHead
       FROM dbo.GetDatesDiagnostic 
      WHERE Group_type = 1
        AND PATIENT_ID = '57443'
    ;
     SELECT C_Date
       FROM #MyHead
      WHERE C_Date >= '2017-04-09 00:00:00.000'
        AND C_Date <  '2017-04-11 00:00:00.000' --Less than the next day actually wanted
    ;

    Give it a try and let us know how it works out.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sorry, I've been away from the keyboard. Looks like Jeff has nailed it. Great advice, especially on the view calling a view thing. That seriously over complicates things and leads to the optimizer making very poor choices.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 2 posts - 31 through 32 (of 32 total)

You must be logged in to reply to this topic. Login to reply