Queries running slower in SQL 2008 than in SQL 2000

  • Frank Carrelli (5/14/2009)


    Jeff,

    How I determined my problem was by viewing the execution plan for the queries that were giving me the problems. In the Execution plan it displayed everything twice. The Coalesce statements were causing the query to run a second time based on a boolean type of statement within the Execution Plan. By replacing the Coalesce with ISNULL caused the boolean statement phrase to disappear from the plan and thus making the query run one time only.

    I would suggest you view your plan and tweak the queries accordingly. Try removing or rephrasing the Exists / Not IN statements and see how it affects the plan. Hope that helps.

    That's not the first time I've seen COALESCE cause problems. It's terrible for performance for concatenation using MAV's (multiple assignent variables) compared to ISNULL and will also cause incorrect results unless the concatenation takes a very specific form. ISNULL works correctly even on the forms where COALESCE does not.

    Heh... I'm going to coin a phrase... "If it's not ANSI, it'll probably work better." 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The query on witch i'm struggling is the following :

    SELECT DISTINCT Cast(Cast(Col1 as nVarchar(2))+Cast(col2 as nVarchar(3))) as nvarchar(5)) as invalidValue

    FROM Table1

    WHERE Cast(Cast(Col1 as nVarchar(2))+Cast(col2 as nVarchar(3))) as nvarchar(5))

    NOT IN

    (

    SELECT DISTINCT Cast(Cast(Col1 as nVarchar(2))+Cast(col2 as nVarchar(3))) as nvarchar(5)) as ValidValue

    FROM Table2

    )

    Alright, it is not the most optimized query but it is generated by a generic solution for an intégrity control... It worked fine on SQL 2000, but on 2008, the query optimizer underestimates the number of lines returned by the subquery and do a nested loop instead of a hash table...

    I've already tried something like this :

    SELECT DISTINCT Cast(Cast(Col1 as nVarchar(2))+Cast(col2 as nVarchar(3))+Cast(col3 as nVarchar(3))) as nvarchar(8)) as invalidValue

    FROM Table1

    WHERE NOT EXISTS (

    SELECT DISTINC Cast(Cast(Col1 as nVarchar(5))+Cast(col2 as nVarchar(3))) as nvarchar(8)) as ValidValue

    FROM Table2

    WHERE Cast(Cast(Col1 as nVarchar(5))+Cast(col2 as nVarchar(3))) as nvarchar(8)) = Cast(Cast(Table1.Col1 as nVarchar(2))+Cast(Table1.col2 as nVarchar(3))+Cast(Table1.col3 as nVarchar(3))) as nvarchar(8))

    )

    No changes...

    any idea ?

  • christophe.mulotti (5/14/2009)


    The query on witch i'm struggling is the following :

    SELECT DISTINCT Cast(Cast(Col1 as nVarchar(2))+Cast(col2 as nVarchar(3))) as nvarchar(5)) as invalidValue

    FROM Table1

    WHERE Cast(Cast(Col1 as nVarchar(2))+Cast(col2 as nVarchar(3))) as nvarchar(5))

    NOT IN

    (

    SELECT DISTINCT Cast(Cast(Col1 as nVarchar(2))+Cast(col2 as nVarchar(3))) as nvarchar(5)) as ValidValue

    FROM Table2

    )

    Alright, it is not the most optimized query but it is generated by a generic solution for an intégrity control... It worked fine on SQL 2000, but on 2008, the query optimizer underestimates the number of lines returned by the subquery and do a nested loop instead of a hash table...

    I've already tried something like this :

    SELECT DISTINCT Cast(Cast(Col1 as nVarchar(2))+Cast(col2 as nVarchar(3))+Cast(col3 as nVarchar(3))) as nvarchar(8)) as invalidValue

    FROM Table1

    WHERE NOT EXISTS (

    SELECT DISTINC Cast(Cast(Col1 as nVarchar(5))+Cast(col2 as nVarchar(3))) as nvarchar(8)) as ValidValue

    FROM Table2

    WHERE Cast(Cast(Col1 as nVarchar(5))+Cast(col2 as nVarchar(3))) as nvarchar(8)) = Cast(Cast(Table1.Col1 as nVarchar(2))+Cast(Table1.col2 as nVarchar(3))+Cast(Table1.col3 as nVarchar(3))) as nvarchar(8))

    )

    No changes...

    any idea ?

    The CASTings you do in the WHERE clause make it impossible for index usage to be optimized. Post the table definitions in the form of CREATE statements and let us take a look. Also, post any indexes there may be on the tables.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (5/4/2009)


    Frank Carrelli (5/4/2009)


    I found my issue that was causing my slow running queries. It appears the Coalesce statements...

    How serendipitous... I've recently made that same discovery with doing things like concatenation in 2k5.

    BWAA-HAA!!! So much for ANSI compliant code! 😉 I'm starting to think that "ANSI" is an abbreviation for "ANother Serious Impediment" 😛

    Yeah, it's almost as if whoever implemented the COALESCE() function didn't want us to use ANSI compliant features. Oh, wait...

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Chris,

    I agree with Jeff. You can not use any sort of castings or Coalesces within the Where Clause. You should only use actual field names, more specifically, fileds that are part of an index to drive your query. If you were to just add the columns together (WHERE Col1 + Col2....) and remove the casts you should have better results. But since you are trying to extract portions of a column and compare to portions of another column within the Where Clause leaves you few options. If this query is something that runs all the time and gets a lot of usuage you may want to look into changing your database design so the portions of each column you are trying to extract are in their own column / index.

  • hi guys,

    Actually, i solved my problem by changig the query to this one :

    SELECT DISTINCT Cast(Cast(tablre1.Col1 as nVarchar(2))+Cast(tabble1.col2 as nVarchar(3))) as nvarchar(5)) , Cast(Cast(tablre2.Col1 as nVarchar(2))+Cast(tabble2.col2 as nVarchar(3))) as nvarchar(5))

    FROM Table1 left outer join table2

    on Cast(Cast(tablre1.Col1 as nVarchar(2))+Cast(tabble1.col2 as nVarchar(3))) as nvarchar(5))

    = Cast(Cast(tablre2.Col1 as nVarchar(2))+Cast(tabble2.col2 as nVarchar(3))) as nvarchar(5))

    AND Cast(Cast(tablre2.Col1 as nVarchar(2))+Cast(tabble2.col2 as nVarchar(3))) as nvarchar(5)) IS NULL

    It works definitly better this way...

    But i'm still a bit perplex about SQL Server 2008, because it seems that NOT IN and NOT EXISTS are not the only statements that are running slower on SQL Server 2008 than SQL Server 2000 on big tables...

  • Our 2000 server was upgraded to 2008. There is a certain step in a DTS now SSIS package that updates a table, setting parents for child records, that used to run in 50 seconds in 2000. It never finished overnight in 2008 so I had to replace it with a WHILE loop which runs in 30-50 minutes now. It does have a subquery, but it's very simple. The flat file does not have any parent - child link (foreign key) so the reason for this step is to set up that relationship. Here are the details:

    - table gets truncated and re-imported from daily flat file;

    - 2.5 million+ rows;

    - 3 types of records in it, parent and two types of child records;

    - identity column TheIdNr , with index;

    - recreating index manually after import doesn't improve the performance;

    - UPDATE C SET TheParent = (SELECT MAX(TheIdNr) FROM myStagingTable P WHERE P.TheIdNr < C.TheIdNr AND P.RecordType ='Parent' )

    FROM myStagingTable C WHERE C.RecordType in ('Child1', 'Child2)

    Is this "normal" in 2008 SQL Server?

  • RazvanS (6/15/2014)


    Our 2000 server was upgraded to 2008. There is a certain step in a DTS now SSIS package that updates a table, setting parents for child records, that used to run in 50 seconds in 2000. It never finished overnight in 2008 so I had to replace it with a WHILE loop which runs in 30-50 minutes now. It does have a subquery, but it's very simple. The flat file does not have any parent - child link (foreign key) so the reason for this step is to set up that relationship. Here are the details:

    - table gets truncated and re-imported from daily flat file;

    - 2.5 million+ rows;

    - 3 types of records in it, parent and two types of child records;

    - identity column TheIdNr , with index;

    - recreating index manually after import doesn't improve the performance;

    - UPDATE C SET TheParent = (SELECT MAX(TheIdNr) FROM myStagingTable P WHERE P.TheIdNr < C.TheIdNr AND P.RecordType ='Parent' )

    FROM myStagingTable C WHERE C.RecordType in ('Child1', 'Child2)

    Is this "normal" in 2008 SQL Server?

    What you have there is what is known as a "Triangular Join" and, if you got it to run as fast as you say you did in SQL Server 2000, you got very, very lucky with the indexes being just right and several other things. Please see the following article on what "Triangular Joins" are and why they're so bad for performance.

    [font="Arial Black"]http://www.sqlservercentral.com/articles/T-SQL/61539/[/font][/url]

    So, what's the alternative? "It Depends". In order to be able to help you out, we need more information about the table and some readily consumable data. Please don't assume that you know how to do that. Please use the methods outlined in the article at the first link under "Helpful Links" in my signature line below. Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you Jeff, that is a good article. A few more considerations:

    - adding with (nolock) in the select max subquery halves the time in the while loop approach here from 40 to 20 minutes;

    - creating another staging table with just the parents in it (about 180K rows), having an index on the integer id field, and using that one instead of the same table in the subquery, solves completely the issue, it runs now without loop in 20 seconds, updating the 2.4+ million child records:

    ...

    Insert Into dbo.ParentStage

    SELECT TheIdNr From dbo.myStagingTable where RecordType ='Parent' order by 1

    UPDATE M

    set M.TheParent =(Select max(TheIdNr ) From dbo.ParentStage where TheIdNr< M.TheIdNr )

    FROM dbo.myStagingTable M

    where M.RecordType !='Parent'

Viewing 9 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply