functions

  • Hello,

    We have a SELECT INTO query that calls 5 functions numerous times. The functions are merely different ways of calculating values such as extended cost; very simple stuff.

    For the past several years, this query has executed in approximately 20 minutes against 3 million rows. Now it is taking 11+ hours. As far as we can tell, nothing has changed. If we remove the functions from the query, it runs in under a minute. Add one function back and the run time increases dramatically.

    Also, when we move the data and code to a development server (much smaller), it runs in 20 minutes. The execution plans are identical between production and development. At this point, we are not sure where to look or how to troubleshoot further.

    We are running SQL Server 2000 Enterprise Edition SP3a with AWE enabled and 32 GB of RAM on Windows 2003 Enterprise Edition.

    Has anyone experienced a problem similar to this one? Any help would be appreciated.

  • Hey,

    Well, I haven't experienced this type of behavior, but it's very interesting and I would like to help.

    First question: how much data is being processed now? How often do you update your indexes?

    Thanks,

    Phillip Cox

  • Thanks for the reply,

    The source table has 3 million rows. We perform a SELECT INTO a temporary table and calculate various cost columns using functions. Functions are used because many columns have the same calculation, just different column values.

    Indexes are rebuilt weekly.

    The problem does not appear to be in the joins. If we remove the functions, it runs in under a minute. We are not CPU-bound either (4 dual-core processors).

  • Are these multi-statement or inline? Scalar? Are they deterministic (and flagged as such)?

    Unfortunately, user-defined functions in SQL Server aren't known for their performance. There are lots of instances where they force scans or row-by-row processing, essentially causing a perf nosedive...

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • They are simple calculations, nothing complex. For example, we pass a couple of parameters like cost, quantity, and unit of measure, then return the extended cost as money.

    In some cases, we may read a lookup table (10 rows) to determine a value based on a code.

    Hope this helps.

  • Can you post code?

    There are a number of constructs within functions that look good and perform terribly. As an example, if you have a scalar udf (user-defined function) that reads 10 rows and you use that function in the select clause of a query that returns 10,000 rows, then the function runs 10,000 times (in most cases) and those 10 rows that it reads are each read 10,000 times.
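
    Just to illustrate the pattern (the table and function names below are made up, not from any real system), a scalar udf with data access hides a per-row query:

        -- Hypothetical scalar UDF that reads a small lookup table on every call
        CREATE FUNCTION dbo.fnLookupFactor (@Code char(2))
        RETURNS money
        AS
        BEGIN
            DECLARE @Factor money

            -- This SELECT executes once for every row of the outer query
            SELECT @Factor = Factor
            FROM dbo.tblFactorLookup
            WHERE Code = @Code

            RETURN ISNULL(@Factor, 0.00)
        END
        GO

        -- Used in the select clause of a 10,000-row query, the lookup above
        -- runs roughly 10,000 times: one hidden read per output row.
        SELECT o.OrderID,
               dbo.fnLookupFactor(o.CostCode) AS Factor
        FROM dbo.tblOrders o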

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

  • Sorry, I can't post the code, but I will try to explain further using an example.

    Select ...
        column1,
        column2,
        column3,
        dbo.fnFunction(a.Cost1, b.COST2, b.COST3, b.XDate, @CurrentDt) as Cost4,
        column5
        ...
    into #temp
    from tblsource1 a              -- 3 million rows
    left join tblsource2 b         -- 700,000 rows
        on a.key = b.key

    CREATE function fnFunction
    (
        @Cost1 money,
        @Cost2 money,
        @Cost3 money,
        @XDate datetime,
        @Today datetime
    )
    Returns money
    as
    begin
        return
            isnull(case
                       when @Cost1 > 0 then @Cost1
                       else case
                                when @XDate > @Today then @Cost2
                                else @Cost3
                            end
                   end, 0.00)
    end

  • If the core of the function involves no data access, you might care to try turning it into a CLR function (deterministic and precise, if applicable). Since CLR functions are compiled, scalar CLR functions tend to execute faster than interpreted T-SQL ones, assuming they don't get slowed down by things like data access, especially on things that CLR is "good" at and T-SQL is only somewhat good at (like string manipulation/pattern matching, formatting, some types of validation, etc.).

    Still - it's not at all unusual to find that simply incorporating the "guts" of the function into the outer select is faster than calling the function. That approach might even be faster than the CLR version of the function (and won't have the resource issues you sometimes run into with CLR objects in SQL Server).
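
    Just as a rough, untested sketch (reusing the column names from the query you posted, so treat it as an illustration only), the inlined version of that particular function would look along these lines:

        -- Same logic as dbo.fnFunction, written straight into the SELECT list
        select
            column1,
            column2,
            column3,
            isnull(case when a.Cost1 > 0 then a.Cost1
                        when b.XDate > @CurrentDt then b.COST2
                        else b.COST3
                   end, 0.00) as Cost4,
            column5
            -- ...remaining columns unchanged
        into #temp
        from tblsource1 a
        left join tblsource2 b on a.key = b.key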

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • In some cases, we may read a lookup table (10 rows) to determine a value based on a code.

    This is your problem.

    Replace those functions with views and join your tables to them.
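
    A rough sketch (the lookup table, view, and column names below are made up, not from your system):

        -- Wrap the 10-row lookup in a view...
        CREATE VIEW dbo.vwFactorLookup
        AS
        SELECT Code, Factor
        FROM dbo.tblFactorLookup
        GO

        -- ...then join to it instead of calling a lookup function per row
        SELECT a.column1,
               ISNULL(l.Factor, 0.00) AS Factor
        INTO #temp
        FROM tblsource1 a
        LEFT JOIN dbo.vwFactorLookup l ON l.Code = a.CostCode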

    _____________
    Code for TallyGenerator

  • Thanks for all the replies. They are very much appreciated!

    What continues to be a mystery is why a procedure that had been running in about 20 minutes since 2005 would jump to 11+ hours. (It appears to me that the optimizer is forcing a cursor operation.)

    Even more confusing is that the extended run time appeared on our development system without warning yesterday. Previously the process ran in 20 minutes; now it takes 11+ hours.

    What would cause this kind of behavior?

  • Thomas Martin (9/12/2008)


    What would cause this kind of behavior?

    Has anything changed? Any index creates/drops? Any database settings changed?

    It could be data volumes: the amount of data's increased and now the optimiser's picked a different plan. It could be index fragmentation. It could be inaccurate statistics.
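
    If you want to rule out the last two, the standard SQL 2000 checks look something like this (the table and index names below are placeholders):

        -- Check fragmentation on the big source table (placeholder name)
        DBCC SHOWCONTIG ('tblsource1')

        -- See when the statistics on the join key were last updated and how they were sampled
        DBCC SHOW_STATISTICS ('tblsource1', 'IX_tblsource1_key')

        -- Refresh statistics with a full scan if they look stale
        UPDATE STATISTICS tblsource1 WITH FULLSCAN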

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

  • Have you rebooted lately? I've seen plans just go out the window sometimes. Have you checked the plan on the development system?

  • Thanks again for all the help.

    Yes, we rebooted the server, rebuilt all of the indexes, and resampled the statistics. When we compared the execution plans between systems, they were identical. The data size is approximately the same, 3 million rows (since 2005). Now we are looking at configuration options.

    When we remove all the functions, the process runs in seconds; then, as soon as we add just one back, we are back to the extended run time.

  • What continues to be a mystery is why a procedure that has been running since 2005 for 20 minutes would increase to 11+ hours. (It appears to me that the optimizer is forcing a cursor operation.)

    Enter "tipping point" in the search box up there and read the articles and posts it will return you.

    It should clear your mystery.

    Once again - data accessing function in SELECT list is a cursor.

    Before some stage there was probably enough memory to hold all results of the functions in memory and reuse them for following rows.

    Now you have either more sets of parameters the before or less memory because of more users, other applications launched on the same server, etc.

    Anyway, there is no enough memory for all data accessing functions results, and server has to run all "hidden cursors" you put into the query.

    _____________
    Code for TallyGenerator

  • To clarify what Sergiy just said... all those functions that do "lookups" are just an insidious way of doing correlated subqueries, which are RBAR on steroids. RBAR can appear to work fairly well if it all fits in memory cuz memory is fast. If it doesn't, you get the symptoms you just mentioned... everything was fine and then, one day, BOOM! Same goes when something works well on dev and then blows up in prod... just like Sergiy said, chances are you have less usable memory available on the prod machine because you have more users or processes, which means it'll "tip" well before the dev box does.

    Also, as Sergiy stated... the best thing to do would be to resolve the lookups in the form of a join. If you're real careful, you can do that in a view and then join to the view.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
