Tuning expensive query II

  • Hi All,

    I need to tune this query :

    ORIGINAL QUERY ( Duration : 1 sec , logical read : 2017 ) as below:

    SELECT DISTINCT "PT"."Z" AS "Created Date",

    "PT"."A" AS "PT_Id",

    "PT"."text" AS "Text"

    FROM

    (

    SELECT "PT_id",

    "Z",

    "T",

    "A",

    "Y",

    "text_id",

    "code",

    "S",

    "P",

    CAST( "text" AS VARCHAR( 2000 ) ) AS "text"

    FROM "DB1"."dbo"."Table A"

    WHERE "code" = 'ABCD'

    )

    "PT"

    I modified the script become :

    SELECT DISTINCT PT."Z" AS "Created Date",

    PT."A" AS "PT_Id",

    CAST( "text" AS VARCHAR( 2000 ) ) AS "text"

    FROM "DB1"."dbo"."Table A" PT

    WHERE "code" = 'ABCD'

    Result : NO different 

    -------------------------------------------------------------------------------------------------------------------------------

    I added Nonclustered Index as below :

    CREATE NONCLUSTERED INDEX [IX_PT_code] ON [dbo].[Table A]

    (

    ASC

    )

    INCLUDE ( [Z], [A]

    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    Result : No different either

    FYI : Table A consist of 197075 rows and this query returns (45108 row(s) . In actual execution plan , it uses Clustered Index scan ( 34%) and sort (66%)

    Any idea what I should do to optimize this query ?

    cheers

  • http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • If you're returning 45,000 rows out of a 197,000 row table, That's approximately 25% of all the data in the table. The fastest way for it to perform that will be to scan the clustered index. The only way to speed this up will be to buy faster disks. Indexing won't change things. However, removing the DISTINCT keyword could help. That's going to be adding aggregation to the query making it even worse.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks for the link . NOTED 🙂

    I attach the execution plan .

    Project text --> Table A

    Below query is clearer I think ( The query that I run ) :

    SELECT distinct PT."created_dttm" AS "Created_Datetime",

    PT."proj_task_id" AS "Project_Task_Id",

    CAST( "text" AS VARCHAR( 2000 ) ) AS "text"

    FROM "DB1"."dbo"."Table A" PT

    WHERE "code" = 'ABCD'

    I don’t think eliminate distinct will make any difference ….

  • WhiteLotus (5/26/2015)


    Thanks for the link . NOTED 🙂

    I attach the execution plan .

    Project text --> Table A

    Below query is clearer I think ( The query that I run ) :

    SELECT distinct PT."created_dttm" AS "Created_Datetime",

    PT."proj_task_id" AS "Project_Task_Id",

    CAST( "text" AS VARCHAR( 2000 ) ) AS "text"

    FROM "DB1"."dbo"."Table A" PT

    WHERE "code" = 'ABCD'

    I don’t think eliminate distinct will make any difference ….

    That's just a picture and tells us nothing. You need to right click on the actual execution plan, save it as a file, and attach that.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (5/26/2015)


    WhiteLotus (5/26/2015)


    Thanks for the link . NOTED 🙂

  • Thanks for that. Unfortunately, I'm working from a 2008 box tonight and I can't open 2008R2 execution plans.

    Someone will be able to help in the morning.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • WhiteLotus (5/26/2015)


    I don’t think eliminate distinct will make any difference ….

    By which you mean you've tested it and it didn't make any difference? ....

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (5/27/2015)


    WhiteLotus (5/26/2015)


    I don’t think eliminate distinct will make any difference ….

    By which you mean you've tested it and it didn't make any difference? ....

    Despite the fact that the DISTINCT SORT operator takes 66% of the cost of the query.

    Looking at the plan, it reflects everything said so far. You're returning 45,000 rows out of 197,000. That's 22% of the total. That's going to be a scan no matter what you do. You have the CAST that's occurring. While that's not free I doubt it's seriously impacting the system since it's doing nothing else. Then you have the DISTINCT SORT operator that, per the query optimizers estimates, is 66% of the cost. That's more than the index scan itself. I'll bet you serious money it makes a difference in performance.

    Your statistics are good since your estimates and actuals match perfectly. Except for removing the DISTINCT clause, there's nothing you can do with this query as long as it's returning such a large percentage of the data.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (5/27/2015)


    GilaMonster (5/27/2015)


    WhiteLotus (5/26/2015)


    I don’t think eliminate distinct will make any difference ….

    By which you mean you've tested it and it didn't make any difference? ....

    Despite the fact that the DISTINCT SORT operator takes 66% of the cost of the query.

    Looking at the plan, it reflects everything said so far. You're returning 45,000 rows out of 197,000. That's 22% of the total. That's going to be a scan no matter what you do. You have the CAST that's occurring. While that's not free I doubt it's seriously impacting the system since it's doing nothing else. Then you have the DISTINCT SORT operator that, per the query optimizers estimates, is 66% of the cost. That's more than the index scan itself. I'll bet you serious money it makes a difference in performance.

    Your statistics are good since your estimates and actuals match perfectly. Except for removing the DISTINCT clause, there's nothing you can do with this query as long as it's returning such a large percentage of the data.

    I just tested the query by eliminating DISTINCT ..

    The same result like before

  • The execution time is the same? I'm not surprised the reads are the same. You're still scanning 45,000 rows.

    There just isn't anything to tune in the query here.

    Wild shot in the dark, maybe if you set up a filtered, nonclustered index on the same key column as your cluster, filtered for 'RESO' and use the three columns from the output as INCLUDE. You'll still get an index scan, but it might be a smaller number of reads since you'll possibly have more data per page (maybe, can't see the rest of your structure). I'm not sure that will help at all.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (5/27/2015)


    The execution time is the same? I'm not surprised the reads are the same. You're still scanning 45,000 rows.

    There just isn't anything to tune in the query here.

    Wild shot in the dark, maybe if you set up a filtered, nonclustered index on the same key column as your cluster, filtered for 'RESO' and use the three columns from the output as INCLUDE. You'll still get an index scan, but it might be a smaller number of reads since you'll possibly have more data per page (maybe, can't see the rest of your structure). I'm not sure that will help at all.

  • First question, did the execution time change with the removal of DISTINCT with the original indexing scheme?

    Next question, did either of those indexes get used? Did you check the execution plan to see if they were picked up by the optimizer or did it stick with the original index. If it didn't choose those, we might want to try an index hint here. But, this part, it's desperation to squeeze a little bit of performance out. It won't make a major difference. The real problem is the volume of data we're attempting to move. There's no way, within a query or index, to simply make that "fast." Moving that much data is largely dependent on hardware.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • WhiteLotus (5/27/2015)


    I just tested the query by eliminating DISTINCT ..

    the result is :

    (45108 row(s) affected)

    Table 'project_text'. Scan count 1, logical reads 2017, physical reads 0, read-ahead reads 0, lob logical reads 62693, lob physical reads 0, lob read-ahead reads 0.

    The same result like before

    And the comparative output of STATISTICS time for the queries with and without DISTINCT? Removing it's not going to change the logical reads, the same amount of data still has to be read from the table.

    btw, why are you tuning a query that runs under 1 second and does ~2000 reads?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Another point, if you are trying to measure the execution time, make darned sure you turn off the STATISTICS IO when you do. They can radically skew[/url] the results of execution time.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply