Better TSQL Performance

  • Hi everyone, 

    I am trying to improve query performance, I have encapsulated the following query into a stored procedure and have made some structural changes, added indices, etc (I an inherited a poorly designed db), but I feel I can do more, looking for advice:  Here is the proc

    IF OBJECT_ID ( 'dbo.MA_GET_LOAN_FUNDED_MARKET', 'P' ) IS NOT NULL 
      DROP PROCEDURE dbo.MA_GET_LOAN_FUNDED_MARKET;
    GO

    CREATE PROCEDURE dbo.MA_GET_LOAN_FUNDED_MARKET 
      @LoanFundedMarket nvarchar(50), 
      @AppVolMarket nvarchar(50) 
    AS

    BEGIN

      SET NOCOUNT ON;

    SELECT DISTINCT Cast(currentYear.respondent AS BIGINT) AS respondent,
          appvol,
          c.institution_name       AS Institution_Name,
          marketsharebyno,
          approved,
          denied,
          funded,
          approvedbyapp,
          fundedbyapp,
          fundedbyapproved,
          currentYear.volfundeddollar,
          MarketShareDollar = CASE
                 WHEN @LoanFundedMarket > 0 THEN
                 currentYear.volfundeddollar /
                 @LoanFundedMarket
                 ELSE 0
                END,    
          GrowthByVolFunded = CASE
                 WHEN priorYear.volfundeddollar > 0 THEN
          (
          currentYear.volfundeddollar - priorYear.volfundeddollar ) / priorYear.volfundeddollar
          ELSE 0
                END,
          Cast(currentYear.respondent AS BIGINT) AS respondent,
          Rank()
          OVER (
           ORDER BY appvol DESC)     AS ranking
    FROM (SELECT Cast(instdtnew_2016.respondent AS BIGINT) AS respondent,
         Sum(noofapp)          AS appVol,
         instdtnew_2016.agency,
         Sum(Isnull(approved, 0))      AS approved,
         Sum(denied)          AS Denied,
         Sum(funded)          AS Funded,
         ApprovedByApp = CASE
               WHEN Sum(noofapp) > 0 THEN
               Sum(approved) / Sum(Cast(
               noofapp AS FLOAT))
               ELSE 0
               END,
         FundedByApp = CASE
               WHEN Sum(noofapp) > 0 THEN
               Sum(funded) / Sum(Cast(
               noofapp AS FLOAT))
               ELSE 0
              END,
         FundedByApproved = CASE
                WHEN Sum(approved) > 0 THEN
                Sum(funded) / Sum(
                Cast(approved AS FLOAT))
                ELSE 0
                END,
         MarketSharebyNo = CASE
                WHEN @AppVolMarket > 0 THEN
                Sum(noofapp) / @AppVolMarket
                ELSE 0
               END,
         Sum(volfundeddollar)       AS VolFundedDollar,
         MarketShareDollar = CASE
                 WHEN @LoanFundedMarket > 0 THEN (
                 Sum(volfundeddollar) / @LoanFundedMarket )
                 ELSE 0
                END
       FROM instdtnew_2016
         INNER JOIN ma_respondent_lookup
            ON instdtnew_2016.respondent =
             ma_respondent_lookup.respondent
             AND instdtnew_2016.agency =
              ma_respondent_lookup.agency
       WHERE (( instdtnew_2016.state IN ( 2, 1, 5, 4,
                   6, 8, 9, 11,
                   10, 12, 13, 15,
                   19, 16, 17, 18,
                   20, 21, 22, 25,
                   24, 23, 26, 27,
                   29, 28, 30, 37,
                   38, 31, 33, 34,
                   35, 32, 36, 39,
                   40, 41, 42, 72,
                   44, 45, 46, 47,
                   48, 49, 51, 50,
                   53, 55, 54, 56 ) ))
       GROUP BY instdtnew_2016.respondent,
          instdtnew_2016.agency) currentYear
       LEFT JOIN (SELECT Cast(instdtnew_2015.respondent AS BIGINT) AS respondent,
             instdtnew_2015.agency       AS Agency,
             Sum(Cast(volfundeddollar AS FLOAT))  AS
             VolFundedDollar
          FROM instdtnew_2015
          WHERE (( instdtnew_2015.state IN ( 2, 1, 5, 4,
                      6, 8, 9, 11,
                      10, 12, 13, 15,
                      19, 16, 17, 18,
                      20, 21, 22, 25,
                      24, 23, 26, 27,
                      29, 28, 30, 37,
                      38, 31, 33, 34,
                      35, 32, 36, 39,
                      40, 41, 42, 72,
                      44, 45, 46, 47,
                      48, 49, 51, 50,
                      53, 55, 54, 56 ) ))
          GROUP BY instdtnew_2015.respondent,
              instdtnew_2015.agency) priorYear
         ON Cast(currentYear.respondent AS BIGINT) = Cast(
          priorYear.respondent AS BIGINT)
          AND currentYear.agency = priorYear.agency
       INNER JOIN (SELECT Cast(a.respondent AS BIGINT) AS respondent,
             a.agency,
             CASE
              WHEN a.cds_id IS NOT NULL
               AND b.charter_type = 'Federal'
               AND Upper(a.institution_name) NOT LIKE
                 '%CREDIT UNION%'
               AND Upper(a.institution_name) NOT LIKE '%CU'
             THEN
              a.institution_name + 'FCU'
              WHEN a.cds_id IS NOT NULL
               AND b.charter_type = 'State'
               AND Upper(a.institution_name) NOT LIKE
                 '%CREDIT UNION%'
               AND Upper(a.institution_name) NOT LIKE '%CU'
             THEN
              a.institution_name + 'CU'
              ELSE a.institution_name
             END         AS institution_name
           FROM [dbo].[ma_vw_respondents] a
             LEFT OUTER JOIN header b
                 ON a.cds_id = b.cds_recnum) c
         ON Cast(currentYear.respondent AS BIGINT) = Cast(
          c.respondent AS BIGINT)
          AND c.agency = currentYear.agency
    ORDER BY marketsharebyno DESC

    END

    GO

    ---------------------------------------------------------

    Thoughts, comments, advice appreciated. Anything I might be missing here?

    TIA,

    PMD

  • First thing first, lets format your sp in a readable form so that if someone interested to help, he/she could easily read the code.


    SELECT DISTINCT Cast(currentYear.respondent AS BIGINT) AS respondent,
          appvol,
          c.institution_name       AS Institution_Name,
          marketsharebyno,
          approved,
          denied,
          funded,
          approvedbyapp,
          fundedbyapp,
          fundedbyapproved,
          currentYear.volfundeddollar,
          MarketShareDollar = CASE
                 WHEN @LoanFundedMarket > 0 THEN
                 currentYear.volfundeddollar /
                 @LoanFundedMarket
                 ELSE 0
                END,
          GrowthByVolFunded = CASE
                 WHEN priorYear.volfundeddollar > 0 THEN
                 (
                 currentYear.volfundeddollar -
                 priorYear.volfundeddollar )
                 /
          priorYear.volfundeddollar
          ELSE 0
          END,
          Cast(currentYear.respondent AS BIGINT) AS respondent,
          Rank()
          OVER (
           ORDER BY appvol DESC )     AS ranking
    FROM (SELECT Cast(instdtnew_2016.respondent AS BIGINT) AS respondent,
         instdtnew_2016.agency,
         Sum(noofapp)          AS appVol,
         Sum(Isnull(approved, 0))      AS approved,
         Sum(denied)           AS Denied,
         Sum(funded)           AS Funded,
         ApprovedByApp = CASE
               WHEN Sum(noofapp) > 0 THEN
               Sum(approved) / Sum(Cast(
               noofapp AS FLOAT))
               ELSE 0
               END,
         FundedByApp = CASE
               WHEN Sum(noofapp) > 0 THEN
               Sum(funded) / Sum(Cast(
               noofapp AS FLOAT))
               ELSE 0
              END,
         FundedByApproved = CASE
                WHEN Sum(approved) > 0 THEN
                Sum(funded) / Sum(
                Cast(approved AS FLOAT))
                ELSE 0
                END,
         MarketSharebyNo = CASE
                WHEN @AppVolMarket > 0 THEN
                Sum(noofapp) / @AppVolMarket
                ELSE 0
               END,
         Sum(volfundeddollar)        AS VolFundedDollar,
         MarketShareDollar = CASE
                 WHEN @LoanFundedMarket > 0 THEN (
                 Sum(volfundeddollar) / @LoanFundedMarket )
                 ELSE 0
                END
       FROM instdtnew_2016
         INNER JOIN ma_respondent_lookup
            ON instdtnew_2016.respondent =
             ma_respondent_lookup.respondent
             AND instdtnew_2016.agency =
              ma_respondent_lookup.agency
       WHERE (( instdtnew_2016.state IN ( 2, 1, 5, 4,
                   6, 8, 9, 11,
                   10, 12, 13, 15,
                   19, 16, 17, 18,
                   20, 21, 22, 25,
                   24, 23, 26, 27,
                   29, 28, 30, 37,
                   38, 31, 33, 34,
                   35, 32, 36, 39,
                   40, 41, 42, 72,
                   44, 45, 46, 47,
                   48, 49, 51, 50,
                   53, 55, 54, 56 ) ))
       GROUP BY instdtnew_2016.respondent,
          instdtnew_2016.agency) currentYear
       LEFT JOIN (SELECT Cast(instdtnew_2015.respondent AS BIGINT) AS respondent
             ,
             instdtnew_2015.agency
             AS Agency,
             Sum(Cast(volfundeddollar AS FLOAT))   AS
             VolFundedDollar
          FROM instdtnew_2015
          WHERE (( instdtnew_2015.state IN ( 2, 1, 5, 4,
                      6, 8, 9, 11,
                      10, 12, 13, 15,
                      19, 16, 17, 18,
                      20, 21, 22, 25,
                      24, 23, 26, 27,
                      29, 28, 30, 37,
                      38, 31, 33, 34,
                      35, 32, 36, 39,
                      40, 41, 42, 72,
                      44, 45, 46, 47,
                      48, 49, 51, 50,
                      53, 55, 54, 56 ) ))
          GROUP BY instdtnew_2015.respondent,
              instdtnew_2015.agency) priorYear
         ON Cast(currentYear.respondent AS BIGINT) = Cast(
          priorYear.respondent AS BIGINT)
          AND currentYear.agency = priorYear.agency
       INNER JOIN (SELECT Cast(a.respondent AS BIGINT) AS respondent,
             a.agency,
             CASE
              WHEN a.cds_id IS NOT NULL
               AND b.charter_type = 'Federal'
               AND Upper(a.institution_name) NOT LIKE
                 '%CREDIT UNION%'
               AND Upper(a.institution_name) NOT LIKE '%CU'
             THEN
              a.institution_name + 'FCU'
              WHEN a.cds_id IS NOT NULL
               AND b.charter_type = 'State'
               AND Upper(a.institution_name) NOT LIKE
                 '%CREDIT UNION%'
               AND Upper(a.institution_name) NOT LIKE '%CU'
             THEN
              a.institution_name + 'CU'
              ELSE a.institution_name
             END         AS institution_name
           FROM [dbo].[ma_vw_respondents] a
             LEFT JOIN header b
               ON a.cds_id = b.cds_recnum) c
         ON Cast(currentYear.respondent AS BIGINT) = Cast(
          c.respondent AS BIGINT)
          AND c.agency = currentYear.agency
    ORDER BY marketsharebyno DESC;

  • See that DISTINCT?  Figure out which join(s) are causing the need for that and fix it.  That will almost certainly be the root of your performance problem.  If you want deeper help on such a performance problem, please see the second link under "Helpful Links" for what else we need to help here.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Seems like you have manual partition and your report is a comparison between this two years. Which means you have fair amount of data to play with.
    How much data do you have in these tables and how much data is being return once you execute this SP?

    Following are few things which you can start your working on.

    1. Did you check which part of query is taking long time? By just looking at your code you are using Cast while joining the table.
        Which is might be hurting you. You should be joining them without Casting and if you don't have index on it then i would suggest to create index on it.
    2. Instead of doing Aggregation and Decision at the same time, it is always better to do PRE-AGGREGATION and then do the CASE Statement Something like this:

    SELECT
        a.respondent
        , a.agency
        , Calc.ApprovedByApp
        , calc.FundedByApp
        , Calc.FundedByApproved
        , Calc.MarketSharebyNo
        , Calc.MarketShareDollar
    FROM
    (
            SELECT
                    FT16.respondent
                    , FT16.agency
                    
                    , ISNULL(Sum(noofapp) ,0)    AS noofapp,
                    , ISNULL(Sum(approved),0)            AS approved,
                    , ISNULL(Sum(denied),0)     AS Denied,
                    , ISNULL(Sum(funded),0)     AS Funded,                
                    , ISNULL(Sum(volfundeddollar),0)    AS volfundeddollar

       FROM instdtnew_2016 AS FT16
         INNER JOIN ma_respondent_lookup lkp
            ON FT16.respondent = lkp.respondent
             AND FT16.agency = lkp.agency
       WHERE (( FT16.state IN ( 2, 1, 5, 4,
                   6, 8, 9, 11,
                   10, 12, 13, 15,
                   19, 16, 17, 18,
                   20, 21, 22, 25,
                   24, 23, 26, 27,
                   29, 28, 30, 37,
                   38, 31, 33, 34,
                   35, 32, 36, 39,
                   40, 41, 42, 72,
                   44, 45, 46, 47,
                   48, 49, 51, 50,
                   53, 55, 54, 56 ) ))

            GROUP BY
                FT16.respondent, FT16.agency
    ) A
    CROSS APPLY
    (
        SELECT

                     ApprovedByApp     = CASE    WHEN noofapp > 0            THEN approved            / noofapp        * 1.0    ELSE 0 END,
                     FundedByApp         = CASE    WHEN noofapp > 0            THEN funded                / noofapp        * 1.0    ELSE 0 END,
                     FundedByApproved = CASE    WHEN approved > 0            THEN funded                / approved        * 1.0    ELSE 0 END,
                     MarketSharebyNo     = CASE    WHEN @AppVolMarket > 0        THEN noofapp            / @AppVolMarket * 1.0    ELSE 0 END,                
                     MarketShareDollar = CASE WHEN @LoanFundedMarket > 0    THEN volfundeddollar    / @LoanFundedMarket        ELSE 0 END
    ) Calc

    3. You are also using Distinct, which means your data is getting duplicated due to a join or something. Did you check which part of your query is generating duplicate entries? it seems that your Last DriveTable in your query might be causing this issue. Try to get ride of these duplicates and Remove the DISTINCT Clause from the SELECT.

  • I'd probably move this stuff:
    FT16..state IN state IN (( 22,, 11,, 55,, 44,,
                                    66,, 88,, 99,, 1111,,
                                    1010,, 1212,, 1313,, 1515,,
                                    1919,, 1616,, 1717,, 1818,,
                                    2020,, 2121,, 2222,, 2525,,
                                    2424,, 2323,, 2626,, 2727,,
                                    2929,, 2828,, 3030,, 3737,,
                                    3838,, 3131,, 3333,, 3434,,
                                    3535,, 3232,, 3636,, 3939,,
                                    4040,, 4141,, 4242,, 7272,,
                                    4444,, 4545,, 4646,, 4747,,
                                    4848,, 4949,, 5151,, 5050,,
                                    5353,, 5555,, 5454,, 5656

    Into a temp table, load it and do a join on it instead of an IN clause. Test it.

    Calculations like this on columns in filtering clauses (WHERE, HAVING, ON) are going to absolutely kill performance:
    ON CastCast((currentYearcurrentYear..respondent AS BIGINTrespondent AS BIGINT)) == CastCast((
           priorYear       priorYear..respondent AS BIGINTrespondent AS BIGINT))

    If those are different data types, to fix performance, you may need to change the structure.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Definitely would help to have the DDL (CREATE TABLE statement) for each of the tables used in the query.  Would also help to see the execution plan for the query, uploaded as a .sqlplan file.
    It looks like there are quite a few things that could be changed, but we don't have the information needed to really help at the moment.

  • Thank you all for your responses.  All have been very helpful.  And forgive me, I lost my formatting somewhere between copy & paste, I will ensure its formatted better in the future.  I will take the comments into consideration and see if that helps. 

    PMD

  • PrettyMegaDBA - Wednesday, November 22, 2017 8:16 AM

    Thank you all for your responses.  All have been very helpful.  And forgive me, I lost my formatting somewhere between copy & paste, I will ensure its formatted better in the future.  I will take the comments into consideration and see if that helps. 

    PMD

    The formatting thing has been a problem ever since they "improved" the forum software.  :angry:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Wednesday, November 22, 2017 8:20 AM

    PrettyMegaDBA - Wednesday, November 22, 2017 8:16 AM

    Thank you all for your responses.  All have been very helpful.  And forgive me, I lost my formatting somewhere between copy & paste, I will ensure its formatted better in the future.  I will take the comments into consideration and see if that helps. 

    PMD

    The formatting thing has been a problem ever since they "improved" the forum software.  :angry:

    You can't blame this one on the forum software.  The code was not included in the proper SQL Code tags.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Thanks Jeff.  And go easy on me J. Drew, I don't post very often scripts for assistance, and as I've mentioned I inherited this code and DB.  Clearly, there are quite a few issues with the code.  Some db structure related and some just poor coding.  But I knew the experts here could advise.  I will try to do better in the future, no need to bash.

    PMD

  • PrettyMegaDBA - Wednesday, November 22, 2017 8:36 AM

    Thanks Jeff.  And go easy on me J. Drew, I don't post very often scripts for assistance, and as I've mentioned I inherited this code and DB.  Clearly, there are quite a few issues with the code.  Some db structure related and some just poor coding.  But I knew the experts here could advise.  I will try to do better in the future, no need to bash.

    PMD

    I'm sure he wasn't trying to bash. Jeff pointed out that since the new forum software was implemented, we all have trouble with formatting. However, if the code is not included in the SQL Code tags, all format is loss and that has always happened.
    This is something that happens a lot, especially to people that are new and we dislike it, but understand it. As long as you try to do better next time, we can all remain friendly in here.
    Also, this is a helpful guide to post good performance related questions: How to Post Performance Problems - SQLServerCentral

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Ok, I hope I am doing this correctly.  I am providing the Execution Plan as .xml, and DDL of the tables and view in the queries.  There are no, indices, no PK's and no FK's.  Before you all go nuts, I've already smacked the hands.  They know it's crap, I know it's crap, we know it's crap. But as I mentioned, I've inherited it this way and unfortunately, it's in production.   But here is some additional background info.

    1. The data is loaded into a db yearly from a .csv provided from the gooberment (yes I spelled it that way, because their goobers 😉
    2. I already know it is poorly designed and have addressed those issues with the team.
    3. The db will be redesigned and the beginning of the year.
    4. In the meantime, the current db is in production, not very large and doesn't have too many users.

    Here are my number of rows returned and timing:

    --18 secs to return 6761 rows (before encapsulated in proc)
    --15 secs to return 6761 rows (after encapsulation)
    --13 secs to return 6761 rows (by using bigint instead of float - making ddl change on table)

    Thanks again for all of your very helpful suggestions!

  • Sorry all, I provided EP as .sql-plan not .xml

    No hard feelings here, and thanks for the tips on how to properly posts perf issues.  

    PMD

  • PrettyMegaDBA - Wednesday, November 22, 2017 9:11 AM

    Ok, I hope I am doing this correctly.  I am providing the Execution Plan as .xml, and DDL of the tables and view in the queries.  There are no, indices, no PK's and no FK's.  Before you all go nuts, I've already smacked the hands.  They know it's crap, I know it's crap, we know it's crap. But as I mentioned, I've inherited it this way and unfortunately, it's in production.   But here is some additional background info.

    1. The data is loaded into a db yearly from a .csv provided from the gooberment (yes I spelled it that way, because their goobers 😉
    2. I already know it is poorly designed and have addressed those issues with the team.
    3. The db will be redesigned and the beginning of the year.
    4. In the meantime, the current db is in production, not very large and doesn't have too many users.

    Here are my number of rows returned and timing:

    --18 secs to return 6761 rows (before encapsulated in proc)
    --15 secs to return 6761 rows (after encapsulation)
    --13 secs to return 6761 rows (by using bigint instead of float - making ddl change on table)

    Thanks again for all of your very helpful suggestions!

    Yow!

    I feel your pain. I'll try to get a moment to look at the plan, but, I'd say in addition to the code changes I suggested, you need to focus on clustered indexes first. The plan is probably a bit of a waste of time at the moment with no indexes on the tables. It's going to be all scans.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey - Wednesday, November 22, 2017 9:19 AM

    PrettyMegaDBA - Wednesday, November 22, 2017 9:11 AM

    Ok, I hope I am doing this correctly.  I am providing the Execution Plan as .xml, and DDL of the tables and view in the queries.  There are no, indices, no PK's and no FK's.  Before you all go nuts, I've already smacked the hands.  They know it's crap, I know it's crap, we know it's crap. But as I mentioned, I've inherited it this way and unfortunately, it's in production.   But here is some additional background info.

    1. The data is loaded into a db yearly from a .csv provided from the gooberment (yes I spelled it that way, because their goobers 😉
    2. I already know it is poorly designed and have addressed those issues with the team.
    3. The db will be redesigned and the beginning of the year.
    4. In the meantime, the current db is in production, not very large and doesn't have too many users.

    Here are my number of rows returned and timing:

    --18 secs to return 6761 rows (before encapsulated in proc)
    --15 secs to return 6761 rows (after encapsulation)
    --13 secs to return 6761 rows (by using bigint instead of float - making ddl change on table)

    Thanks again for all of your very helpful suggestions!

    Yow!

    I feel your pain. I'll try to get a moment to look at the plan, but, I'd say in addition to the code changes I suggested, you need to focus on clustered indexes first. The plan is probably a bit of a waste of time at the moment with no indexes on the tables. It's going to be all scans.

    Thanks so much!

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply