Function in Joining column degrades performance in 2008

  • Hi,

    I have an existing code in sql server 2000, which after migrating to sql server 2008 the performance has become poor.

    The query inside the code have a select statement which uses joins inside it. The joining statement has a function call on one of the joining column. The query returns the results faster in 2000 environment but it takes several minutes in 2008 to return the results.

    Query Format:

    select col1, col2, col3 .....

    from #tmp t inner join table1

    on table1.col1 = t.col1

    left join table2 on table1.col1 = table2.col2

    left join table3 on dbo.func_logic(table1.col1) = table3.col1

    where table1.col4 <> 5 -- some business logic

    Does 2008 has some kind of settings that woule enable the performance of the query faster?

    Note: When I looked at the query plan of 2008, the place where we have the function call goes for the Lazy pool.

    In the query plan of 2000, I dont see anything like that.

  • To get rid of the lazy pool you should alter the function WITH SCHEMABINDING.

    As a side note, using functions in join conditions is not a good idea at all.

    Hope this helps

    Gianluca

    -- Gianluca Sartori

  • aravind-305595 (6/29/2010)


    Hi,

    I have an existing code in sql server 2000, which after migrating to sql server 2008 the performance has become poor.

    The query inside the code have a select statement which uses joins inside it. The joining statement has a function call on one of the joining column. The query returns the results faster in 2000 environment but it takes several minutes in 2008 to return the results.

    Query Format:

    select col1, col2, col3 .....

    from #tmp t inner join table1

    on table1.col1 = t.col1

    left join table2 on table1.col1 = table2.col2

    left join table3 on dbo.func_logic(table1.col1) = table3.col1

    where table1.col4 <> 5 -- some business logic

    Does 2008 has some kind of settings that woule enable the performance of the query faster?

    Note: When I looked at the query plan of 2008, the place where we have the function call goes for the Lazy pool.

    In the query plan of 2000, I dont see anything like that.

    I agree with Gianluca. Using a SCALAR function as part of a join is a guarantee of slow code. See if you can rewrite the function to be an Inline Table Valued Function which returns more than one row at a time and join the function as if it were a table.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Is that a single statement table valued function or a multi-statement table valued function? If multi-statement, you were just lucky that it ran well in 2000. These types of functions don't have statistics and work off a basis of one row, which can result in pretty radically poor execution plans.

    However, if it's the former, you may just need to restructure the query. The optimizer is very different between 2008 and 2000. Almost every query I've worked with came out better in 2008, but I've heard of a few that didnt'.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Heh... I finally beat Grant to the punch for a change. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (6/30/2010)


    Heh... I finally beat Grant to the punch for a change. 😀

    Oh right, Mr. Umpty-gazillion points, you "finally" beat me to the punch. :hehe:

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Hi, Thanks all. I thought to re-write that piece of code with suggestion given in this forum. Now I face another issue with the joins, my code looks like,

    update table1 set col1 = s.col1

    from table1 w inner join [SQLServer2].[database1].[dbo].[table1] s on w.col2 + ':' + w.col3 = s.col2 and w.col4 = s.col3

    where w.col1 is null

    and w.col2 is not null.

    I am using two concatenated columns the join and the joining table is from a linked server. The table1 has around 2 million records.

    The same query works in 2 minutes in SQL 2000 but it takes around 18 minutes in SQL 2008.

    I dont know why the joins degrade the performance for these kind of queries in SQL 2008. Please suggest some best idea to overcome this issue. Again if I have to re-write his code, I am not sure how many more will come with this kind of scenario in the joins.

  • You're never going to get great performance with this type of query, but it's worth checking that at least collation compatible is set to true in the properties for the linked server so it can avoid returning the entire table to the local server first.

    Are both servers 2008?

  • aravind-305595 (7/9/2010)


    Hi, Thanks all. I thought to re-write that piece of code with suggestion given in this forum. Now I face another issue with the joins, my code looks like,

    update table1 set col1 = s.col1

    from table1 w inner join [SQLServer2].[database1].[dbo].[table1] s on w.col2 + ':' + w.col3 = s.col2 and w.col4 = s.col3

    where w.col1 is null

    and w.col2 is not null.

    I am using two concatenated columns the join and the joining table is from a linked server. The table1 has around 2 million records.

    The same query works in 2 minutes in SQL 2000 but it takes around 18 minutes in SQL 2008.

    I dont know why the joins degrade the performance for these kind of queries in SQL 2008. Please suggest some best idea to overcome this issue. Again if I have to re-write his code, I am not sure how many more will come with this kind of scenario in the joins.

    The concatenation of two columns is going to kill performance. You won't be able to use indexes. Better to have a calculated column that you can index for this.

    Also, are you hopping servers? If so, you're going to get smacked with the fact that it has to bring all the data back from the second server for processing on the first one. As written it won't filter the data on the second server. Instead you would need to use OPENQUERY and pass parameters to filter the data on the second server before it retrieves it.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Yes both are 2008 servers

  • Your best bet without doing any major code changes is to try something like this:

    ALTER TABLE table1

    ADD Concatenated_Column as col2 + ':' + col3 PERSISTED

    CREATE INDEX IDX_table1_Concatenated_Column ON table1(Concatenated_Column)

    You must have the same collation between the two servers and you must have collation compatible set on the linked server properties.

    This may on it's own be enough to improve the performance without needing a rewrite, but you'll have to examine the execution plan to see if the computed column is being used (the SQL Engine will automatically use a persisted computed column if it thinks it's more efficient, even if it's not explicitly referenced in the statement). If not, you'll have to re-write the join to reference the computed column.

    Performance tuning linked server queries is always temperamental, the best thing to do would be to re-engineer the solution so that it doesn't require them (e.g. have batch processes that bulk load data from one database to another or use transactional replication etc.)

  • aravind-305595 (7/13/2010)


    Yes both are 2008 servers

    Ok... then you might want to try the "computed column" thing that I suggested and that Howard materialized with code. You can (most likely, if you keep it "deterministic") persist the column and add an index to it just as Howard did in his code.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It may be worth checking your physical join operators as well, as I've encountered performance issues with stored procedures during migrations from 2000 to 2008...

  • kchant (7/14/2010)


    It may be worth checking your physical join operators as well, as I've encountered performance issues with stored procedures during migrations from 2000 to 2008...

    Can you elaborate a bit on that?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • kchant (7/14/2010)


    It may be worth checking your physical join operators as well, as I've encountered performance issues with stored procedures during migrations from 2000 to 2008...

    Can you elaborate a bit on that?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply