Home Forums SQL Server 2008 T-SQL (SS2K8) Convert the stored procedure to "Standard SQL Select" RE: Convert the stored procedure to "Standard SQL Select"

  • halifaxdal (4/12/2014)


    Thank you Jeff, I appreciate your enlightening, here is what I got so far:

    SELECTB = Sum(Case When

    DocStatus = 'Active'

    Then 1 Else 0 End),

    C = Sum(Case When

    DocStatus = 'Active' and

    CAST(YEAR(TargetClosureDate) AS int) = dbo.fnFiscalYear(getdate())

    Then 1 Else 0 End),

    D = Sum(Case When

    DocStatus = 'Active' and

    CAST(YEAR(TargetClosureDate) AS int) > dbo.fnFiscalYear(getdate())

    Then 1 Else 0 End),

    A = Sum(Case When

    DocStatus = 'Retired' and

    CAST(YEAR(RetiredDate) AS int) = (dbo.fnFiscalYear(getdate())-1)

    Then 1 Else 0 End),

    Critical = Sum(Case When

    DocStatus = 'Active' and

    RiskRating = 4

    Then 1 Else 0 End),

    High = Sum(Case When

    DocStatus = 'Active' and

    RiskRating = 3

    Then 1 Else 0 End),

    Medium = Sum(Case When

    DocStatus = 'Active' and

    RiskRating = 2

    Then 1 Else 0 End),

    Low = Sum(Case When

    DocStatus = 'Active' and

    RiskRating = 1

    Then 1 Else 0 End),

    TBD = Sum(Case When

    DocStatus = 'Active' and

    RiskRating = 0

    Then 1 Else 0 End),

    BCD =Sum(Case When

    DocStatus = 'Active'

    Then 1 Else 0 End) ----B

    +

    Sum(Case When

    DocStatus = 'Active' and

    CAST(YEAR(TargetClosureDate) AS int) = dbo.fnFiscalYear(getdate())

    Then 1 Else 0 End)-----C

    +

    Sum(Case When

    DocStatus = 'Active' and

    CAST(YEAR(TargetClosureDate) AS int) > dbo.fnFiscalYear(getdate())

    Then 1 Else 0 End)

    FROM Document d

    left outer join ITSDivision i on d.ITSDivisionID = i.ID

    Where

    i.Division = @SVP and

    DocType = 1

    It returns all values for a specific @SVP except the DRR which is a percentage DRR = A/(B+C+D), can you help? Thanks.

    Heh... rush, rush, rush. No one wants to peel just one potato at a time and that's why you're having problems with the DRR calculation and why your code will run slower than it needs to. 😉

    You did real well in realizing that it wasn't a COUNT that you wanted and that you wanted to do conditional SUMs of 1's, instead. That technique is known as a CROSSTAB and you can read more about it at the following two links.

    http://www.sqlservercentral.com/articles/T-SQL/63681/

    http://www.sqlservercentral.com/articles/Crosstab/65048/

    The reason why I accused you of not peeling one potato at a time is because you've had to do a recalculation of B, C, and D to calculate the value of the BCD column right now. As a result, your code is unneccessarily long and the extra calculations could make it run a fair bit slower unless the Optimizer recognizes those formulas as the same formulas previously calculated and just happens to reuse them.

    With the idea that we want to make the code do as few duplicate calculations as necessary for reasons of performance and to make the code both easier to read and troubleshoot, let's begin the simplification process...

    As you pointed out, your code returns all the values for a specific @SVP. Another way of saying that is that it returns all of the values for JUST ONE @SVP. If we look back at the original cursor code, @SVP isn't a parameter... it's a variable populated by the cursor and then it loops through each and every value of the ITSDivision table just to find the ITSDivisionID... WHICH IS ALREADY AVAILABLE IN THE DOCUMENT TABLE AND IT DOES A LEFT JOIN ON THE DOCUMENT TABLE. That at least implies that you want everything in the document table regardless. Also notice that the DIVISION column (presumably a name column) isn't actually used anywhere except in the final output, so let's get rid of the join to the ITSDivision table and GROUP BY the ITSDivisionID column of the document table and get rid of the join altogether. It's just not needed, yet, and there's no sense in doing the join on so many rows as what are in the Document table.

    Another simplification is that there's actually no need to CAST the return value of YEAR to an INT. It's guaranteed to be an INT, so casting it just burns clock cycles without adding any value.

    To get rid of repetative calculations, I removed the calculation for BCD. We'll add it in later just like we will for the "Division" and DRR columns... all of which will be incredibly simple because we're simplifying the base code.

    I also shortened the length of the code and did some nice vertical alignment to make it easier to read. We still have some more simplification and alignment to do but, if we apply the simplifications that we have so far, here's what we end up with. Note that I don't have your tables or data so I'm doing this by "feel" and I've not actually tested the code.

    SELECTITSDivisionID

    ,A = Sum(Case When DocStatus = 'Retired' and YEAR(RetiredDate) = (dbo.fnFiscalYear(getdate())-1) Then 1 Else 0 End)

    ,B = Sum(Case When DocStatus = 'Active' Then 1 Else 0 End)

    ,C = Sum(Case When DocStatus = 'Active' and YEAR(TargetClosureDate) = dbo.fnFiscalYear(getdate()) Then 1 Else 0 End)

    ,D = Sum(Case When DocStatus = 'Active' and YEAR(TargetClosureDate) > dbo.fnFiscalYear(getdate()) Then 1 Else 0 End)

    ,Critical = Sum(Case When DocStatus = 'Active' and RiskRating = 4 Then 1 Else 0 End)

    ,High = Sum(Case When DocStatus = 'Active' and RiskRating = 3 Then 1 Else 0 End)

    ,Medium = Sum(Case When DocStatus = 'Active' and RiskRating = 2 Then 1 Else 0 End)

    ,Low = Sum(Case When DocStatus = 'Active' and RiskRating = 1 Then 1 Else 0 End)

    ,TBD = Sum(Case When DocStatus = 'Active' and RiskRating = 0 Then 1 Else 0 End)

    FROM dbo.Document

    WHERE DocType = 1

    GROUP BY d.ITSDivisionID

    ;

    The next simplification that will help this code fly is to remove the year "calculations". Simple comparisons with no calculations to the left of the relational operator can be a fair bit faster than comparisons with equations on both sides of the relational operator. We also don't need to calculate the Fiscal year more than once. If we get lucky, the Optimizer will recognize the function calculation as a "run-time constant". If we don't get so lucky, it'll be doing a whole bunch of duplicated calculations. With that in mind and assuming that we have to write this as a direct replacement for a stored procedure instead of an iTVF (Inline Table Valued Function) or a view, we can make it so that we only do a couple of calculations for fiscal year for the whole proc instead of once for each row. With the idea that we don't want to have to calculate the integer based year for multiple times for every row in the Document table, we're going to do something a little different and not use the function, at all (and there's a HUGE wakup call coming, as well).

    The first step in all of that is to calculate the start dates (1st of November for each respective year) for each required fiscal year... like this.

    --===== Declare local variables for the start date of fiscal years

    DECLARE @FYStart DATETIME --November 1st of current FY

    ,@FYNextStart DATETIME --November 1st of next FY

    ,@FYPrevStart DATETIME --November 1st of previous FY

    ;

    --===== Calculate the start date of each named fiscal year

    SELECT @FYStart = DateAdd(mm,-2,DateAdd(yy,DateDiff(yy,0,DateAdd(mm,2,Getdate())),0))

    ,@FYNextStart = DateAdd(yy,1,@FYStart)

    ,@FYPrevStart = DateAdd(yy,-1,@FYStart)

    ;

    Now, we can remove some of those nasty ol' CPU eating calculations from the code and replace them with simple comparisons.

    --===== Declare local variables for the start date of fiscal years

    DECLARE @FYStart DATETIME --November 1st of current FY

    ,@FYNextStart DATETIME --November 1st of next FY

    ,@FYPrevStart DATETIME --November 1st of previous FY

    ;

    --===== Calculate the start date of each named fiscal year

    SELECT @FYStart = DateAdd(mm,-2,DateAdd(yy,DateDiff(yy,0,DateAdd(mm,2,Getdate())),0))

    ,@FYNextStart = DateAdd(yy, 1,@FYStart)

    ,@FYPrevStart = DateAdd(yy,-1,@FYStart)

    ;

    SELECTITSDivisionID

    ,A = Sum(Case When DocStatus = 'Retired' and RetiredDate >= @FYPrevStart and RetiredDate < @FYStart Then 1 Else 0 End)

    ,B = Sum(Case When DocStatus = 'Active' Then 1 Else 0 End)

    ,C = Sum(Case When DocStatus = 'Active' and TargetClosureDate >= @FYStart and TargetClosureDate < @FYNextStart Then 1 Else 0 End)

    ,D = Sum(Case When DocStatus = 'Active' and TargetClosureDate >= @FYNextStart Then 1 Else 0 End)

    ,Critical = Sum(Case When DocStatus = 'Active' and RiskRating = 4 Then 1 Else 0 End)

    ,High = Sum(Case When DocStatus = 'Active' and RiskRating = 3 Then 1 Else 0 End)

    ,Medium = Sum(Case When DocStatus = 'Active' and RiskRating = 2 Then 1 Else 0 End)

    ,Low = Sum(Case When DocStatus = 'Active' and RiskRating = 1 Then 1 Else 0 End)

    ,TBD = Sum(Case When DocStatus = 'Active' and RiskRating = 0 Then 1 Else 0 End)

    FROM dbo.Document

    WHERE DocType = 1

    AND DocStatus IN ('Active','Retired')

    GROUP BY ITSDivisionID

    ;

    [font="Arial Black"]HERE'S THE HUGE WAKEUP CALL!!!. [/font]Let's look at the way they did the date calculations in the original code...

    YEAR(TargetClosureDate) = dbo.fnFiscalYear(getdate())

    Do you see anything wrong with that code? Anything at all? Let's do a wee bit o' analysis. If GETDATE() is 2014-04-12, then the function will return the number 2014 as the fiscal year. All of the dates for TargetClosureDate must be in the same year of 2014. BUT, BECAUSE THEY DIDN'T APPLY THE SAME FUNCTION TO THE TARGETCLOSUREDATE COLUMN, [font="Arial Black"]IT'S RETURNING THE CALENDAR YEAR INSTEAD OF THE FISCAL YEAR!!!!! [/font]WHOOPS!!! How long has that problem been in place with the original code????

    The method that I used correctly calculates the start date of fiscal years and then compares the dates for columns like TargetClosureDate against the actual dates in a fiscal year so the calculation is actually done correctly. This is what happens when people try to take such shortcuts. NEVER take shortcuts where date ranges are involved. ALWAYS (and I don't say that word often) follow the basic format for doing the comparisons as in the following...

    WHERE SomeDateColumn >= SomePeriodStartDate

    AND SomeDateColumn < NextPeriodStartDate

    The method above is also a hidden performance enhancement because it doesn't have to calculate a value for each and ever bloody SomeDateColumn in a given table.

    Don'cha just love surprises? 😛

    Now... how to calculate BCD, DRR, and retrieve the value of the related "Division" column WITHOUT repeating a bunch of calculations? I'm going to let you think about that a bit with the following hint... Do you know what CTEs are?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)