Select query inside procedure taking long time, clustered indexes doing scans, need suggestion

  • I have a procedure, which is running very slow, and is being timed out without returning any records. [Reports] Table is already generated by another SP, which creates indexes on this underlying table ([Reports]) and this SP just reads data from this [Reports] table.

    Approx number of records 150K and is taking more than 15 minutes to load this data.

    End user needs "Select All" for all filters i.e. they need data for all filters available on report.

    I have added current execution plan for slow procedure and underlying table structure:

    https://drive.google.com/open?id=0BxYEMI40aU_aRjZaVzJyMXJfb0k

    What could be the reason for slow performance?

    Things I have tried:

    --Local Parameters are already in place (to avoid parameter sniffing)

    --Updated statics

    --Tried to have column store index on all VARCHAR(MAX) columns

    --SET ARITHABORT ON

    --OPTION (OPTIMIZE FOR UNKNOWN)

    --Tried to create full text search column but couldn't do due to underlying table structure (no unique single non-nullable column index is available)

    But there is no improvement in performance.

    Please let me know if additional information is required to answer this question and give suggestions

    Thanks!!!

  • I'm still reading through all this code and the execution plan, but here's some initial things I see that give me some concern:

    1. I see you're using a multi-line instead of in-line table valued function in your fn_SplitString. in-line functions work much more efficiently and quicker, and it looks like it's getting called hundreds of thousands of times. I'd think you'd want to do the splitting of the values in the variables separately before this query to ensure it just happens once.

    There's an excellent article and example of splitting strings by Jeff Moden here

    http://www.sqlservercentral.com/articles/Tally+Table/72993/

    2. There's a number of places you're using NOLOCK hints, This could cause some issues in the returned data, since the query may see inconsistent or uncommitted data.

    3. The join between dbo.EOS_Reports and #TempUserAccess looks and feels very awkward with all those AND CASE conditions in the join. This seems to be one of those "catch all queries" that will be difficult to consistently get good performance out of.

    4. Your execution plan shows clustered index scan on EOS_Reports, estimated 148,104 rows and estimated data size 5585 Megabytes in the first half of the UNION. For EOS_Reports indexes, you don't seem to have an index on OrgPathID which is the primary join condition to #TempUserAccess, and given the problem in 3 above, this is why you're getting that full scan.

  • How about posting the actual code instead of an estimated execution plan? Also, it would be a LOT better to see the actual execution plan. You have a splitter in here named fn_Split, I can tell from the execution plan that your splitter needs to be thrown away for a better one. You have a multi-statment table valued function with a loop. This is the absolute worst possible way to write a function. It is actually slower than a scalar function. To rub salt in the wound it appears you call this function repeatedly.

    Share the code of your procedures and we can help minimize the number of times you need to call this. And more importantly we can help you replace that function with one that isn't so painfully slow.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • @SSCrazy- Thanks for these valuable inputs..

    fn_SplitString - This function is used to split the pipe delim. parameters being passed from multi-select dropdown boxes, and we have 5 such parameters. As of now, I'm not sure If I can change this function/behavior as this has been used across all reports/procs.

    The join between dbo.EOS_Reports and #TempUserAccess is required, and the CASE statements is using this fn_SplitString per filter criteria.

    I am not getting why clustered scan, because OrgPathID is part of Clustered Index key defined on EOS_Reports table.

    @SSCoach - I have put code in same folder, thanks for your inputs.!

  • sqlandbiuser (12/5/2016)


    ...

    I am not getting why clustered scan, because OrgPathID is part of Clustered Index key defined on EOS_Reports table

    ...

    When considering how SQL Server uses indexes, the order of the columns in the index matters. Since you said the clustered index definition is (InstanceId, OrgID, Sharedid), and OrgID is the second column, it can't effectively use OrgID in reading EOS_Reports table unless the join conditions or WHERE clause conditions also included InstanceID.

  • sqlandbiuser (12/5/2016)


    @SSCrazy- Thanks for these valuable inputs..

    fn_SplitString - This function is used to split the pipe delim. parameters being passed from multi-select dropdown boxes, and we have 5 such parameters. As of now, I'm not sure If I can change this function/behavior as this has been used across all reports/procs.

    The join between dbo.EOS_Reports and #TempUserAccess is required, and the CASE statements is using this fn_SplitString per filter criteria.

    I am not getting why clustered scan, because OrgPathID is part of Clustered Index key defined on EOS_Reports table.

    @SSCoach - I have put code in same folder, thanks for your inputs.!

    I would bet my lunch money that the biggest issue you have from a performance perspective is that splitter. But you still haven't shown us your splitter. You also haven't given us an actual execution plan. When you have performance problems we need to know the table structure (which we don't have), indexes (which we don't have), approximate row counts (which we don't have).

    Here is a great article that discusses the best options when needing help with a performance problem. http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • @sean, The splitter definition is in the query plan file.

    CREATE FUNCTION [dbo].[fn_SplitString]

    (

    -- Add the parameters for the function here

    @List NVARCHAR(MAX),

    @Delimiter NVARCHAR(255)

    )

    RETURNS

    @SplitTable TABLE

    (

    token nvarchar(4000)

    )

    AS

    BEGIN

    -- Fill the table variable with the rows for your result set

    Declare @products nvarchar(max) = @List

    Declare @individual nvarchar(200) = null

    WHILE LEN(@products) > 0

    BEGIN

    IF PATINDEX('%'+@Delimiter+'%',@products) > 0

    BEGIN

    SET @individual = SUBSTRING(@products, 0, PATINDEX('%'+@Delimiter+'%',@products))

    insert into @SplitTable

    SELECT @individual

    SET @products = SUBSTRING(@products, LEN(@individual + @Delimiter) + 1,LEN(@products))

    END

    ELSE

    BEGIN

    SET @individual = @products

    SET @products = NULL

    insert into @SplitTable

    SELECT @individual

    END

    END

    RETURN

    @sqlandbiuser You also split each of the strings three separate times. You should consider storing your results in a temp table or table variable and use those in the queries.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • When considering how SQL Server uses indexes, the order of the columns in the index matters. Since you said the clustered index definition is (InstanceId, OrgID, Sharedid), and OrgID is the second column, it can't effectively use OrgID in reading EOS_Reports table unless the join conditions or WHERE clause conditions also included InstanceID

    ...

    This is a conditional join, InstanceID join has a different criteria on #TempTble and so on for OrgId and ShareId.. So I assumed I can't use InstanceId in other two joins (other two UNION statements). Thanks!

    But you still haven't shown us your splitter. You also haven't given us an actual execution plan. When you have performance problems we need to know the table structure (which we don't have), indexes (which we don't have), approximate row counts (which we don't have).

    I've already placed table structure along with indexes and just put the splitter definition and actual query plan.. Thanks Sean Lange!

  • sqlandbiuser (12/5/2016)I have put code in same folder, thanks for your inputs.!

    OK looking at the whole stored procedure code (I could only see part of it using the sqlplan file), I can see that the second part of the UNION does effectively use the index on EntityInstanceID *** it estimates only 102 rows, but the third part of the UNION is joining the 2 tables on EOS_Reports.SharedOrgPathID, which also does not have an index, so that explains the other full scan.

    for Sean (and anyone else watching) I did find the function definition within the sqlplan file as this, which is definitely sub-optimal:

    CREATE FUNCTION [dbo].[fn_SplitString]

    (

    -- Add the parameters for the function here

    @List NVARCHAR(MAX),

    @Delimiter NVARCHAR(255)

    )

    RETURNS

    @SplitTable TABLE

    (

    token nvarchar(4000)

    )

    AS

    BEGIN

    -- Fill the table variable with the rows for your result set

    Declare @products nvarchar(max) = @List

    Declare @individual nvarchar(200) = null

    WHILE LEN(@products) > 0

    BEGIN

    IF PATINDEX('%'+@Delimiter+'%',@products) > 0

    BEGIN

    SET @individual = SUBSTRING(@products, 0, PATINDEX('%'+@Delimiter+'%',@products))

    insert into @SplitTable

    SELECT @individual

    SET @products = SUBSTRING(@products, LEN(@individual + @Delimiter) + 1,LEN(@products))

    END

    ELSE

    BEGIN

    SET @individual = @products

    SET @products = NULL

    insert into @SplitTable

    SELECT @individual

    END

    END

    RETURN

  • Chris Harshman

    OK looking at the whole stored procedure code (I could only see part of it using the sqlplan file), I can see that the second part of the UNION does effectively use the index on EntityInstanceID *** it estimates only 102 rows, but the third part of the UNION is joining the 2 tables on EOS_Reports.SharedOrgPathID, which also does not have an index, so that explains the other full scan.

    SharedOrgPathID is also part of clustered key. 3rd Column in the index definition.

  • sqlandbiuser (12/5/2016)

    SharedOrgPathID is also part of clustered key. 3rd Column in the index definition.

    true... but each part of the UNION is evaluated by the optimizer as a separate query, then the results are "concatenated" together, and then sorted to determine distinct records since you're doing UNION instead of UNION ALL.

    I'd agree with Sean here, the first thing is to replace that splitter function you're using, and preferably only parse the strings once. If you need to get rid of the table scans, then add indexes on EOS_Reports(OrgPathID), and EOS_Reports(SharedOrgPathID). Also consider if you need to do the deduplication caused by UNION, or if you can use UNION ALL instead.

  • Chris Harshman

    I'd agree with Sean here, the first thing is to replace that splitter function you're using, and preferably only parse the strings once. If you need to get rid of the table scans, then add indexes on EOS_Reports(OrgPathID), and EOS_Reports(SharedOrgPathID). Also consider if you need to do the deduplication caused by UNION, or if you can use UNION ALL instead.

    Thanks Chris, I will see possibility of splitter function (being common for all SPs). As suggested, I've created these two indexes, and have put UNION ALL. It seems performance is improved little bit. I can see reports displaying records when selected "few" parameters instead of "ALL" from all drop downs. It's still causing issue for "Select All" in drop-downs of SSRS Report. Query itself is slow, so no issue with the SSRS Rendering part.

    I will update if I get any improvements. Thanks a lot.!

  • Procedure is still slow and taking more time for more records.. For 150K records, given select proc takes more than 10 mins to get the data.

    Any further inputs will be highly appreciated!

  • You are splitting each string 3 times, once for each UNION - exactly the same way each time. Pre-process the string splitting: take the results from the splitting subqueries and put them into temporary tables. Then replace the function calls within your WHERE clauses and rewrite them to use the temporary tables.

  • sqlandbiuser (12/6/2016)


    Procedure is still slow and taking more time for more records.. For 150K records, given select proc takes more than 10 mins to get the data.

    Any further inputs will be highly appreciated!

    Have you replaced the multi-statement TVF with an inline TVF yet, such as Jeff Moden's referenced in this article?

    http://www.sqlservercentral.com/articles/Tally+Table/72993/

    it would also help if those parameters and variables weren't VARCHAR(MAX). Is there a need to have parameters that large? There's also a type mismatch between these parameters and variables being VARCHAR with the actual table columns which are NVARCHAR, causing implicit conversion.

    Once you have a better function, you can use it like this before your main query:

    DECLARE @AgreementName_p varchar(8000) = 'value1|value2|value3|value4|value5|value6';

    CREATE TABLE #AgreementList (Item varchar(200));

    INSERT INTO #AgreementList (Item)

    SELECT Item FROM DelimitedSplit8K(@AgreementName_p,'|');

    so then you can use set based logic in your main query instead of repeatedly calling the splitter function within the query.

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply