Improve reporting stored procedure execution time - tuning temporary tables?

  • Hi everyone,

This is my first post here, so please take it easy on me if I'm not doing something the usual way; also, please know that this is my first real-life performance tuning task. The database I'm working with is SQL Server 2008 R2.

With all that out of the way: I've been tasked with improving the performance of a reporting stored procedure which is called by an SSRS front-end. The procedure currently takes about 30 seconds to run against the largest amount of data (based on filters set from the report front-end).

The stored procedure consists of 19 queries, most of which transform the data from its initial (legacy) format inside the tables into a meaningful dataset to be displayed to the business side.

I've created a query based on a few DMVs to find the most resource-consuming queries in the stored procedure (small snippet below), and I have found one query which takes about 10 seconds, on average, to complete.

select
      object_name(st.objectid) [Procedure Name]
    , dense_rank() over (partition by st.objectid order by qs.last_elapsed_time desc) [rank-execution time]
    , dense_rank() over (partition by st.objectid order by qs.last_logical_reads desc) [rank-logical reads]
    , dense_rank() over (partition by st.objectid order by qs.last_worker_time desc) [rank-worker (CPU) time]
    , dense_rank() over (partition by st.objectid order by qs.last_logical_writes desc) [rank-logical write]
    ...
from sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_sql_text (qs.sql_handle) as st
cross apply sys.dm_exec_text_query_plan (qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) as qp
where st.objectid in ( object_id('SuperDooperReportingProcedure') )
order by [rank-execution time]
    , [rank-logical reads]
    , [rank-worker (CPU) time]
    , [rank-logical write] desc

Now, this query is a bit strange, in the sense that the execution plan shows that the bulk of the work (~80%) is done when inserting the data into the local temporary table, not when querying and transforming the source data from the other tables. (The screenshot below is from SQL Sentry Plan Explorer.)

Also, the row estimates in the execution plan are way off: only 4,218 rows are actually inserted into the local temporary table, as opposed to the ~248k rows the plan estimates it is moving into that table.

One of my first suggestions was to re-write the entire process and stored procedure so that the moving and transforming of the data is done nightly, outside the reporting stored procedure (real-time data is not required, only data up to the previous day). But the business side does not want to invest time and resources into redesigning this, and instead "suggests" I do performance tuning in the sense of finding where and what indexes I can add to speed things up.

I don't believe that adding indexes to the base tables will improve the performance of the report, since most of the query's running time is spent saving the data into a temporary table. From what I know, that hits tempdb, meaning the rows are written to disk, and time increases due to I/O latency.

Even so, as I've mentioned, this is my first performance tuning task. I've tried to read as much as possible on the topic in the last couple of days, and the above are my conclusions so far, but I'd like advice from a broader audience and hopefully a few more insights into what I can do to improve this procedure.

A few specific questions I'd appreciate answers to:

- Is there anything incorrect in what I have said above (in my understanding of the DB or in my assumptions)?

- Is it true that adding an index to a temporary table will actually increase execution time, since the table (and its associated indexes) is rebuilt on each execution?

- Is there anything else that could be done in this scenario without re-writing the procedure / queries, only via indexes or other tuning methods? (I've read a few article headlines saying you can also "tune tempdb", but I haven't got into the details of those yet.)

    Any help is very much appreciated and if you need more details I'll be happy to post.

    ~ Just some guy trying to tune queries ~

  • Your questions:

On your assumptions: they may or may not be right. Adding or altering indexes on your base tables may improve performance of the report, but it may also harm insert performance in the base system. And if there are 19 queries in the process, you have to look at 19 query plans.

On indexing the temp tables: it depends. Sometimes the optimiser ignores indexes on temp tables; sometimes it prefers primary key constraints. As with any performance tuning: test, test, test.
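To show the two shapes being compared, a minimal sketch with made-up table and column names (we haven't seen your schema):

create table #report_data              -- option 1: clustered PK declared with the table
(
    store_ID   int not null,
    source_ID  int not null,
    lead_count int not null,
    primary key clustered (store_ID, source_ID)
);

create table #report_data_alt          -- option 2: index added after creation;
(                                      -- this is the one the optimiser sometimes ignores
    store_ID   int not null,
    source_ID  int not null,
    lead_count int not null
);

create clustered index cix_report_data_alt
    on #report_data_alt (store_ID, source_ID);

Run the problem query against both shapes and compare the plans.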

On tempdb: maybe. You can move the log files and tempdb onto their own separate disks.
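If you do, the move itself is small; a sketch assuming the default logical file names (tempdev/templog - the drive letters are examples), taking effect at the next instance restart:

alter database tempdb
modify file (name = tempdev, filename = 'T:\tempdb\tempdb.mdf');

alter database tempdb
modify file (name = templog, filename = 'L:\tempdb\templog.ldf');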

    Some thoughts:

1. I presume that the query plan you show is an estimated plan rather than an actual one, if you say that the row totals are wrong. So please post an actual query plan. You'll also need to post the code which accompanies it, otherwise people will just be guessing at how to improve it.

2. If the process takes 30 seconds for the largest set of data, what is the acceptable target? If the part of the query you show takes 10 seconds, even a decent improvement may only bring that down to 2-3 seconds, so your overall process still takes 22-23 seconds. Is that acceptable?

3. You say that reporting data up to the previous day is acceptable, so it should be possible (if you have the space) to run an overnight process which builds a denormalised reporting table that you can index and sort correctly for your report (and set a 100% fill factor). What you are doing is moving the individual processing out of the report call and into a batch process, so the only performance overhead when calling the report is the SELECT itself. This should be relatively straightforward since all your code is written and tested already; a sketch follows below.
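A minimal sketch of that overnight step; every name here is made up, and the SELECT stands in for your existing, already-tested transformation code:

if object_id('dbo.ReportSnapshot') is not null
    drop table dbo.ReportSnapshot;

select src.ProgramName, src.Region, src.Store, src.BudgetMonth,
       sum(src.BudgetValue) as Budget
into dbo.ReportSnapshot
from dbo.LegacySource as src        -- hypothetical base table
group by src.ProgramName, src.Region, src.Store, src.BudgetMonth;

-- read-only between nightly loads, so pack the pages full
create clustered index cix_ReportSnapshot
    on dbo.ReportSnapshot (ProgramName, Region, Store)
    with (fillfactor = 100);

The report procedure then reduces to a plain SELECT (with the report's filters) against dbo.ReportSnapshot.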

  • 1. It's the actual execution plan, not the estimated. The execution plan from above is taken from querying the DMVs mentioned in my post.

    The query that gives this execution plan is:

select
      b.ProgramName
    , b.Region
    , case when b.AM is null and b.ProgramName is not null
           then 'Unassigned'
           else b.AM
      end as AM
    , rtrim(ltrim(b.Store)) Store
    , trd.Store_ID
    , b.appliesToPeriod
    , isnull(trd.countLeadActual, 0) as Actual
    , isnull(sum(case when b.budgetType = 0 and b.budgetMonth between @start_date and @end_date then b.budgetValue else 0 end), 0) as Budget
    , isnull(sum(case when b.budgetType = 0 and b.budgetMonth between @start_date and @end_date and (trd.considerMe = -1 or b.StoreID < 0) then b.budgetValue else 0 end), 0) as CleanBudget
    ...
into #SalvesVsBudgets
from #StoresBudgets b
left join #temp_report_data trd on trd.store_ID = b.StoreID and trd.newSourceID = b.ProgramID
where (b.StoreDivision is not null or (b.StoreDivision is null and b.ProgramName = 'NewProgram'))
group by
      b.ProgramName
    , b.Region
    , case when b.AM is null and b.ProgramName is not null
           then 'Unassigned'
           else b.AM
      end
    , rtrim(ltrim(b.Store))
    , trd.Store_ID
    , b.appliesToPeriod
    , isnull(trd.countLeadActual, 0)

I understand that adding and removing indexes can help the performance of queries on the base tables, but those parts of the queries are already as fast as possible (or fast enough).

2. Acceptable performance would be 10 seconds or less for the entire process. The other queries are similar to this one but take considerably less time (3-4 seconds on average), mainly because the amount of data they save into temporary tables is smaller. If I can find a solution for this 10-second query, I believe I can apply the same fix to the others as well.

3. This is exactly something I have recommended to the business team, but not something they want to pursue right now. So I'm "stuck" for a while trying to find solutions that don't involve code changes to the process: ideally only index tuning and other configuration / small changes (I've tested table variables instead of temporary tables).

    ~ Just some guy trying to tune queries ~

  • radu.gheorghiu (8/4/2016)


    1. It's the actual execution plan, not the estimated. The execution plan from above is taken from querying the DMVs mentioned in my post.

    ...

    That makes it an estimated plan.

-- Do this processing in the code which populates #StoresBudgets:
case when b.AM is null and b.ProgramName is not null
     then 'Unassigned'
     else b.AM
end as AM

-- ...and this too:
rtrim(ltrim(b.Store))

-- Create a clustered index on #temp_report_data on (store_ID, newSourceID).
-- Create a clustered index on #StoresBudgets on (ProgramName, Region, AM, Store, appliesToPeriod),
-- checking first to ensure that these are significant partitions (hint: if you can get away with
-- using MAX and omitting them from the GROUP BY list, then they are not).
-- THEN post the actual execution plan as a .sqlplan attachment.
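Spelled out as code, the index suggestions look like this (a sketch: I'm assuming nothing beyond the column names already visible in your query, and wide varchar keys may hit the 900-byte index key limit):

create clustered index cix_temp_report_data
    on #temp_report_data (store_ID, newSourceID);

create clustered index cix_StoresBudgets
    on #StoresBudgets (ProgramName, Region, AM, Store, appliesToPeriod);

And the MAX test, simplified to two columns for illustration: if this returns the same rows as grouping by both ProgramName and Region, then Region adds nothing to the grouping and can be dropped from the GROUP BY list:

select b.ProgramName, max(b.Region) as Region
from #StoresBudgets b
group by b.ProgramName;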

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • The execution plan is taken from:

, cast(qp.query_plan as xml) [Execution plan]
from sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_sql_text (qs.sql_handle) as st
cross apply sys.dm_exec_text_query_plan (qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) as qp

And as per the documentation, it doesn't say anywhere that this is an estimated plan (or maybe I'm not getting it...).

    I can copy the XML generated from that column into SQL Sentry Plan Explorer and that's where the screenshot is from.

    ~ Just some guy trying to tune queries ~

  • radu.gheorghiu (8/4/2016)


The execution plan is taken from:

, cast(qp.query_plan as xml) [Execution plan]
from sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_sql_text (qs.sql_handle) as st
cross apply sys.dm_exec_text_query_plan (qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) as qp

And as per the documentation, it doesn't say anywhere that this is an estimated plan (or maybe I'm not getting it...).

I can copy the XML generated from that column into SQL Sentry Plan Explorer and that's where the screenshot is from.

It's the cached plan - cached for reuse. It's not augmented with the stats from the last execution; you have to capture the "actual plan" for that. The easiest way is to run the query in SSMS with [Include Actual Execution Plan] switched on, then save the plan as a .sqlplan file, which can then be posted here.
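If the toolbar option is awkward, SET STATISTICS XML does the same job per session. This assumes the procedure exposes the @start_date/@end_date parameters seen in your query - the values here are made up:

set statistics xml on;

exec dbo.SuperDooperReportingProcedure
    @start_date = '20160701',
    @end_date   = '20160731';

set statistics xml off;

Each statement, including those inside the procedure, returns its runtime ("actual") plan as an extra XML result, which you can open in SSMS and save as a .sqlplan file.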


  • Ok, I will do so and post it here.

But if I look at the cached plans and this is the one whose last execution time falls within the execution of my procedure, doesn't that mean the procedure used this plan?

From what I know, before executing a query SQL Server looks at the cached plans and might pick one of them, based on statistics (parameter sniffing?). So even if I post the actual execution plan, isn't it very likely to be the same (if no statistics have changed)?

    I'm a beginner, so I might have these things messed up in my mind..

    ~ Just some guy trying to tune queries ~

  • radu.gheorghiu (8/4/2016)


    Ok, I will do so and post it here.

But if I look at the cached plans and this is the one whose last execution time falls within the execution of my procedure, doesn't that mean the procedure used this plan?

From what I know, before executing a query SQL Server looks at the cached plans and might pick one of them, based on statistics (parameter sniffing?). So even if I post the actual execution plan, isn't it very likely to be the same (if no statistics have changed)?

    I'm a beginner, so I might have these things messed up in my mind..

    The so-called "actual plan" is augmented with metrics from the execution which produced it. It will look the same as, or very similar to, the "estimated plan". Differences between the two may be significant.


ChrisM, I'm sorry, but after a few discussions with my TL, I can't post the .sqlplan due to NDA restrictions. Anyway, in the screenshots above the table names have been changed from the originals, but the execution path & data are real.

Would you please guide me through some of the things you would check in this scenario? DMVs that might be helpful, or other similar tools? I appreciate your replies greatly.

    ~ Just some guy trying to tune queries ~

SQL Sentry Plan Explorer has an obfuscation option, so you can mask actual table and column names to get around NDAs. Since you've already published the original query a few posts up this is kinda moot, and I also don't understand why your TL objects: everything we can see relates to two temporary tables. Crazy. Anyway, there's always a workaround. Try the stuff I posted ^^. Without the missing plans, that's all folks will be able to do: try this, try that.


Thanks for the suggestions. Actually, the query posted above is already a bit obfuscated (changed column names, some aliases etc.), but the obfuscation from SQL Sentry worked much better (I didn't know about that, thanks!).

    Anyway, I have attached the obfuscated execution plan. Thank you for your help so far!

    ~ Just some guy trying to tune queries ~

  • radu.gheorghiu (8/8/2016)


Thanks for the suggestions. Actually, the query posted above is already a bit obfuscated (changed column names, some aliases etc.), but the obfuscation from SQL Sentry worked much better (I didn't know about that, thanks!).

    Anyway, I have attached the obfuscated execution plan. Thank you for your help so far!

-- Here's the query which has been the topic of this thread to date:

select
      b.ProgramName
    , b.Region
    , case when b.AM is null and b.ProgramName is not null
           then 'Unassigned'
           else b.AM
      end as AM
    , rtrim(ltrim(b.Store)) Store
    , trd.Store_ID
    , b.appliesToPeriod
    , isnull(trd.countLeadActual, 0) as Actual
    , isnull(sum(case when b.budgetType = 0 and b.budgetMonth between @start_date and @end_date then b.budgetValue else 0 end), 0) as Budget
    , isnull(sum(case when b.budgetType = 0 and b.budgetMonth between @start_date and @end_date and (trd.considerMe = -1 or b.StoreID < 0) then b.budgetValue else 0 end), 0) as CleanBudget
    ...
into #SalvesVsBudgets
from #StoresBudgets b
left join #temp_report_data trd on trd.store_ID = b.StoreID and trd.newSourceID = b.ProgramID
where (b.StoreDivision is not null or (b.StoreDivision is null and b.ProgramName = 'NewProgram'))
group by
      b.ProgramName
    , b.Region
    , case when b.AM is null and b.ProgramName is not null
           then 'Unassigned'
           else b.AM
      end
    , rtrim(ltrim(b.Store))
    , trd.Store_ID
    , b.appliesToPeriod
    , isnull(trd.countLeadActual, 0)

-- Here's the query from the (estimated) execution plan you posted:

select Object1.Column1
    , Object1.Column2
    , Object1.Column3
    , Object1.Column4
    , Object1.Column5
    , Object1.Column6 Column6
    , Object1.Column7
    , Object1.Column8
    , Object1.Column9
    , Function1(Object1.Column10,?) as Column11
    , Function1(Function2(case when Object1.Column12 = ? and Object1.Column13 between Variable1 and Variable2 then Object1.Column14 else ? end),?) as Column15
    , Function1(Function2(case when Object1.Column12 = ? and Object1.Column13 between Variable1 and Variable2 and (Object1.Column16=? or Object1.Column17 < ?) then Object1.Column14 else ? end),?) as Column18
    , Function1(Object1.Column19,?) as Column20
    , Function1(Function2(case when Object1.Column12 = ? and Object1.Column13 between Variable1 and Variable2 then Object1.Column14 else ? end),?) as Column21
    , Function1(Function2(case when Object1.Column12 = ? and Object1.Column13 between Variable1 and Variable2 and (Object1.Column16=? or Object1.Column17 < ?) then Object1.Column14 else ? end),?) as Column22
    , Function1(Object1.Column23,?) as Column24
    , Function1(Function2(case when Object1.Column12 = ? and Object1.Column13 between Variable1 and Variable2 then Object1.Column14 else ? end),?) as Column25
    , Function1(Function2(case when Object1.Column12 = ? and Object1.Column13 between Variable1 and Variable2 and (Object1.Column16=? or Object1.Column17 < ?) then Object1.Column14 else ? end),?) as Column26
    , Function1(Object1.Column27,?) as Column28
    , Function1(Function2(case when Object1.Column12 = ? and Object1.Column13 between Variable1 and Variable2 then Object1.Column14 else ? end),?) as Column29
    , Function1(Function2(case when Object1.Column12 = ? and Object1.Column13 between Variable1 and Variable2 and (Object1.Column16=? or Object1.Column17 < ?) then Object1.Column14 else ? end),?) as Column30
    , Function1(Object1.Column31,?) as Column32
    , Function1(Object1.Column33,?) as Column34
    , Function1(Object1.Column35,?) as Column36
    , Function1(Object1.Column37,?) as Column38
    , Function1(Function2(case when Object1.Column12 = ? and Object1.Column13 between Variable1 and Variable2 then Object1.Column14 else ? end),?) as Column39
    , Function1(Function2(case when Object1.Column12 = ? and Object1.Column13 between Variable1 and Variable2 and (Object1.Column16=? or Object1.Column17 < ?) then Object1.Column14 else ? end),?) as Column40
    , Function1(Object1.Column41,?) as Column41
    , Function1(Object1.Column42,?) as Column42
    , Function1(Function3(case when Object1.Column12 = ? and Object1.Column13 between Variable1 and Variable2 then Object1.Column43 else ? end),?) as Column44
    , Function1(Function3(case when Object1.Column12 = ? and Object1.Column13 between Variable1 and Variable2 and Object1.Column16 = ? then Object1.Column43 else ? end),?) as Column45
into Object2
from Object3 Object1
group by Object1.Column1
    , Object1.Column2
    , Object1.Column3
    , Object1.Column4
    , Object1.Column5
    , Object1.Column6
    , Object1.Column7
    , Object1.Column8
    , Object1.Column9
    , Function1(Object1.Column10,?)
    , Function1(Object1.Column19,?)
    , Function1(Object1.Column23,?)
    , Function1(Object1.Column27,?)
    , Function1(Object1.Column31,?)
    , Function1(Object1.Column33,?)
    , Function1(Object1.Column35,?)
    , Function1(Object1.Column37,?)
    , Function1(Object1.Column41,?)
    , Function1(Object1.Column42,?)


I'm sorry; I'm posting this issue on multiple websites, lost track a bit, and forgot to mention that it was a simplified version of the original query (that's why there are 3 dots before INTO #SalvesVsBudgets).

    I hope it doesn't make a huge difference for analysis / tuning..

    ~ Just some guy trying to tune queries ~

  • radu.gheorghiu (8/8/2016)


I'm sorry; I'm posting this issue on multiple websites, lost track a bit, and forgot to mention that it was a simplified version of the original query (that's why there are 3 dots before INTO #SalvesVsBudgets).

    I hope it doesn't make a huge difference for analysis / tuning..

They're completely different queries. This thread has been concerned with a query referencing two temp tables; the execution plan is for a query referencing only one.

Decide which query you'd like the folks at SSC to work on, and post the actual execution plan (not the estimated plan).


Totally newbie mistake: I attached the wrong file. I've replaced it with the correct file in the original post and attached it here as well.

    ~ Just some guy trying to tune queries ~
