SQLServerCentral Article

How to Make Scalar UDFs Run Faster (SQL Spackle)

,

"SQL Spackle" is a collection of short articles written based on multiple requests for similar code. These short articles are NOT meant to be complete solutions. Rather, they are meant to "fill in the cracks."

--Phil McCracken

Introduction

This will probably end up being the longest "Spackle" article I'll ever write but, except for length, it does fit the definition of being a "Spackle" article. It's not meant to be a complete solution. It's meant to fill in a couple of cracks in the knowledge of T-SQL without offering a specific solution to specific problems.

As with many of my articles, the information in this article isn't new information. Rather, it contains information that has been around longer than many people have been writing T-SQL. The reason I'm writing this article is to provide a single reference as to the performance and perceived performance problems with Scalar UDFs and a possible solution.

To commence…

I don't know about you but I think User Defined Functions are a marvel. They allow us to build and encapsulate code so that others may easily solve common "problems" in a code project easily, quickly, and consistently.

However, and almost without fail, as soon as someone suggests using a Scalar UDF (Scalar User Defined Function) on any SQL Forum, many other people will jump on the thread to remind everyone that "Scalar UDFs are slow" even if they're "memory only" functions (functions that don't access a table or use TempDB to any extent). Some folks have identified that the use of Scalar UDFs can make a query that otherwise takes only seconds to run to suddenly take minutes to run.

Is that true? Are Scalar UDFs really that bad? If so, is there a way to improve the performance of some of them?

The main purpose of this article is, of course, to answer those questions. But, there's more. We'll also find out that how people measure the performance of a Scalar UDF may be more of a problem than the UDF itself and that there are actually two different types of Scalar UDFs, one of which seems to be mostly unknown and can be extremely fast.

The Test Data

As is always true with my articles, we're going to let the code do the talking when it comes to performance. To do that, we need lots of data especially since the function we're going to test is so very simple and fast. A table with a million rows in it is what I normally test things with. The data for this article will be no exception.

If you'd like to play along as this article progresses, please run the following code to build the test data. Don't let the fact that it's going to build a million rows put you off because the following "Pseudo Cursor" code takes less than 2 seconds to build the million row test table even on my 10 year old, single 1.8GHz CPU  desktop computer.

/*********************************************************************
 Build a million row test table consisting of one column of numbers.
 This takes something less than 2 seconds on most machines.
*********************************************************************/--===== Conditionally drop the test table to make reruns
     -- in SSMS easier.
     IF OBJECT_ID('tempdb..#TestTable','U') IS NOT NULL
        DROP TABLE #TestTable
;
GO
--===== Create and populate the test table on-the-fly
 SELECT TOP 1000000
        SomeNumber = IDENTITY(INT,1,1)
   INTO #TestTable
   FROM sys.all_columns ac1
  CROSS JOIN sys.all_columns ac2
;
GO

The Scalar Function

Let's keep this real simple so that we don't confuse the performance issues with anything else. Let's create a Scalar UDF that does nothing but multiply the number we give it by 2 and return that number.

/*********************************************************************
 This is the simple Scalar function that we'll initially use to
 test performance of Scalar functions with. All it does is multiply
 the integer that it's give by two. This is a classic Scalar UDF.
*********************************************************************/ CREATE FUNCTION dbo.TimesTwo
        (@SomeINT INT)
RETURNS INT WITH SCHEMABINDING AS
  BEGIN
 RETURN @SomeINT * 2
    END
;

The code above is what most people think of when it comes to Scalar UDFs. It identifies the data-type of the return, has a "body" identified by a BEGIN/END block definition, and returns just one value. Of course, such a function can be much more complicated with variable declarations, multiple SELECT statements, and even WHILE loops, but it will still return only one value.

The Baseline Test

Before we test the function, let's first see what happens if we do a test where we multiply the numbers in the test table by two directly in the code. That'll give us a good baseline to show us just exactly how slow putting even simple calculations in a Scalar Function will be.

As a bit of a side bar, you'll notice that I throw away the results by dumping them into a variable. Why? Just one reason. I'm not trying to measure how long it takes to display or store a million rows in a table. I just want to measure the time it takes for the code to execute. Dumping the results to a variable is an easy way to take display and disk storage times out of the picture.

/*********************************************************************
 Baseline test to see how long the problem takes without a function.
*********************************************************************/--===== Declare a "throw-away" variable to take display/storage times
     -- out of the picture.
DECLARE @Result INT
;
--===== Begin measuring CPU time and duration using a common method
    SET STATISTICS TIME ON
;
--===== Test the code using the "throw-away" variable
 SELECT @Result = SomeNumber * 2
   FROM #TestTable
;
--===== Stop measuring CPU time and duration
    SET STATISTICS TIME OFF
;

Here are the results and all I can say is MAN! That's fast code. Now you know why we're testing using a million rows! Your measurements on a newer, faster machine will probably show that the code is twice as fast. (Note that all of the code was executed in SQL Server 2005. I did it all in 2005 because I wanted to use my slower machine to exemplify the difference in performance in all the tests available in this article.)

SQL Server Execution Times:
   CPU time = 797 ms,  elapsed time = 878 ms.

That's our baseline to compare all of our other runs against.

Testing the Function

Now, let's test the function against the same million row table for performance using the same test harness as we did for the "Baseline" code.

/*********************************************************************
 Test the function using the same test harness we used for the
 baseline.
*********************************************************************/--===== Declare a "throw-away" variable to take display/storage times
     -- out of the picture.
DECLARE @Result INT
;
--===== Begin measuring CPU time and duration using a common method
    SET STATISTICS TIME ON
;
--===== Test the code using the "throw-away" variable
 SELECT @Result = dbo.TimesTwo(SomeNumber)
   FROM #TestTable
;
--===== Stop measuring CPU time and duration
    SET STATISTICS TIME OFF
;

Here are the results from that run.

SQL Server Execution Times:
   CPU time = 109031 ms,  elapsed time = 299436 ms.

If we do the math and compare the statistics times between the baseline run and the run above, which uses the Scalar UDF, we find that the Scalar Function used more than 135 times the CPU and took more than 340 times longer to execute. Doing a bit more simple math, we discover that just the use of a Scalar UDF took code that previously ran in sub-second times and bloated it out to almost 5 minutes! The rumors are true. Scalar UDFs aren't just a performance problem, they're a horrible performance problem and should probably be banned from all code! Right?

Well… maybe not. As with all else in SQL Server, "It Depends".

"If you measure it, you change it."

Any good Scientist will tell you, "If you measure something, you change it". One of the primary goals of every Scientist is to affect what is being measured as little as possible.

One day, quite a while back, I was rebuilding a UDF to solve the infamous "Initial Caps" problem. Being one of those that had "learned" the lesson that Scalar UDFs are a Pandora's box insofar as performance goes, I built a Tally Table based inline function (more on inline functions later) to solve the problem. It ran quite fast according to the elapsed time indicator at the bottom right corner of the SSMS window. It wasn't "0" seconds, but it was fast.  Or, at least I thought so.

I also began testing someone else's attempt to solve the same "initial Caps" problem for comparison purposes. They had not only used a Scalar UDF in the classic sense, but it also used a WHILE loop. Even though it was a "memory only" solution, I expected it to be dog slow especially because of the WHILE loop and was all set for the "easy kill". Surprisingly, it returned in much less time than the inline function and produced the same correct result set! Gasping for air and trying desperately to not let the vein popping out of my forehead explode because my fine "set based" code had just been pummeled into the ground by a wanton RBAR1 method, I wanted to see just exactly how fast it was. I dropped the code into my normal test harness, which uses SET STATISTICS TIME ON, just like I had thousands of other times with non-UDF based code.

The results from that weren't at all what I expected, though. The Scalar UDF now took more than 5 minutes to run against the same rows of test data as it had before!

What went wrong? After carefully double checking the code I had copied and running the code over and over in utter disbelief, I went back to what I had originally done to try to figure out what I had done wrong. I started from "the beginning" and ran just the code without SET STATISTICS because that was the only thing I had changed. Again and to my continued utter amazement, the Scalar UDF returned to blazing speeds. I tried the same thing with the inline function. There was virtually no difference in duration with an inline function but SET STATISTICS made huge difference with the Scalar Function.

Starting to realize the dark truth, I changed the method I was using to measure duration. Instead of using SET STATISTICS, I "started a timer" using GETDATE() at the start of the code and then used GETDATE() again at the end to calculate the duration. The duration returned to being lightning fast.

So what IS the dark truth? You might think that it was just the revelation that SET STATISTICS TIME ON had a profound affect on Scalar UDFs because of their "Hidden RBAR"2 nature but that wasn't the whole of it. The real revelation was that a whole lot of people, including myself, may have giving Scalar UDFs a really bad reputation that they might not actually deserve because we were testing them incorrectly!  Our tests were actually changing the results by a significant amount!

Let's see the "GETDATE() method" of measuring duration in action with the current function. Here's the code to test with.

/*********************************************************************
 Test the function for duration without using SET STATISTICS TIME ON.
*********************************************************************/--===== Declare a "throw-away" variable to take display/storage times
     -- out of the picture.
DECLARE @Result INT
;
--===== Begin measuring duration using GETDATE
DECLARE @StartTime DATETIME;
 SELECT @StartTime = GETDATE()
;
--===== Test the code using the "throw-away" variable
 SELECT @Result = dbo.TimesTwo(SomeNumber)
   FROM #TestTable
;
--===== "Stop the timer" and report the duration
  PRINT DATEDIFF(ms,@StartTime,GETDATE())
;

Here are the "OMG" results.

7456

Instead of the Scalar Function taking nearly 5 minutes to execute, it actually only took 7.456 seconds to execute! Although the Scalar UDF did impart a substantial amount of overhead (runs almost 8 times slower than the baseline, in fact), it's not the horrendous problem that our initial performance testing supposedly "proved".

The lesson learned here is that "If you measure it, you change it". Be careful how you measure such things and use the correct method based on what you're doing. Whatever you do, don't use SET STATISTICS TIME ON to measure the performance of a Scalar UDF because it'll make it look a whole lot worse than it actually is.

You might actually say that one way to make a Scalar UDF run faster is to simply measure its performance using the correct tools. 😉

Of course, as fast as 7.456 seconds may seem to process a million rows, it's still almost 8 times slower than the baseline code that solved the problem directly. Let's see if we can do something about THAT!

A Different Type of Scalar UDF

I actually owe this wonderful bit of "Spackle" of knowledge to Paul White.  As my boss says, it's a "Spec-hack-ular" idea.

I've forgotten how long ago it was but I was solving a problem on one of the threads on the forums here at SQLServerCentral.com. Because I was necessarily returning a Scalar Value, I used a Scalar UDF in the classic sense. Paul White was one of the people who jumped in on the "Scalar UDFs are Slow" bandwagon but his post was different than all of the others. He suggested using an "iTVF" (inline Table Valued Function) to return the single value instead of using a classic Scalar UDF.

If you lookup "Scalar Function" in Books Online, you'll find that there are actually two types of user defined Scalar Functions.

User-defined scalar functions return a single data value of the type defined in the RETURNS clause. For an inline scalar function, there is no function body; the scalar value is the result of a single statement.

My jaw dropped when I first read that. There's currently no such thing as an "inline scalar function" in T-SQL. There are Scalar Functions ("slow" for short), Multi-line Table Valued Functions (mTVF for short), and Inline Table Valued Functions (iTVF for short). It says so right in Books Online in the "CREATE FUNCTION" reference.

Could it be that we could use an iTVF as a "inline scalar function"??? Remembering that Paul had written a wonderful pair of articles on the use of CROSS APPLY and how to use it with table valued functions in general, I setout to try to figure out what Paul had suggested to me.

To make a much longer story shorter, I converted the code in the Scalar UDF to an "all in one query" that would work in a single value Table Valued Function and it worked better than I could have ever hoped. I've been using the method to replace classic Scalar UDFs wherever I can.

There is one huge restriction, of course. The "inline scalar functions" (what I now call "iSF" ) must be written as an "all in one query". Considering how simple most functions are, that's usually not a problem but you do have to be aware of the requirement  Of course, after having previously read Paul's articles on CROSS APPLY, that task has become a whole lot easier, as well.

Let's rewrite our current "TimesTwo" function as an iSF and see what happens.

--===== Drop the old function
DROP FUNCTION dbo.TimesTwo;
GO
/*********************************************************************
 This function multiplies the integer that it's give by two just like
 the previous version of the function demonstrated so far. The
 difference is that it's been written as an iTVF (inline Table Valued
 Function).  Since it only returns a single value, it becomes what
 Books Online referred to as an "inline Scalar Function" or "iSF" for
 short. It's been written as a single query and the body is no longer
 defined by a BEGIN/END block.  The datatype  being return is whatever
 the datatype of the result is, which could be changed by CAST or
 CONVERT.
*********************************************************************/CREATE FUNCTION dbo.TimesTwo
        (@SomeINT INT)
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN SELECT Doubled = @SomeINT * 2
;
GO

And, now, let's test it.

/*********************************************************************
 Test the function for duration without using SET STATISTICS TIME ON.
*********************************************************************/--===== Declare a "throw-away" variable to take display/storage times
     -- out of the picture.
DECLARE @Result INT
;
--===== Begin measuring duration using GETDATE
DECLARE @StartTime DATETIME;
 SELECT @StartTime = GETDATE()
;
--===== Test the code using the "throw-away" variable
 SELECT @Result = ca.Doubled
   FROM #TestTable
  CROSS APPLY dbo.TimesTwo(SomeNumber) ca
;
--===== "Stop the timer" and report the duration
  PRINT DATEDIFF(ms,@StartTime,GETDATE())
;

As you can see, there's another drawback to converting Scalar UDFs to iSFs. You have to change the way you use the function. Because it's really an iTVF in the background, you can't just replace it in the SELECT list. Instead, you have to treat it as and use it as if it were a table.

Is it worth it? Like I said before, "It Depends". Let me help you make up your mind, though. Here's how long it took to run in milli-seconds.

863

In this case, that's actually a smidgen faster than the baseline code and it's not just a kludge of the moment. I've run the same tests on my machine dozens of times in preparation for this article and the iSF code always beats the baseline code.

But don't get too excited yet.  We haven't actually invented some form of DBCC TIMEWARP here.  Let's rerun the baseline code using the GETDATE() method of measuring the duration instead of SET STATISTICS TIME ON and see what happens.

/*********************************************************************
 Test the baseline again except without SET STASTICS TIME ON this time
*********************************************************************/--===== Declare a "throw-away" variable to take display/storage times
     -- out of the picture.
DECLARE @Result INT
;
--===== Begin measuring duration using GETDATE
DECLARE @StartTime DATETIME;
 SELECT @StartTime = GETDATE()
;
--===== Test the code using the "throw-away" variable
 SELECT @Result = SomeNumber * 2
   FROM #TestTable
;
--===== "Stop the timer" and report the duration
  PRINT DATEDIFF(ms,@StartTime,GETDATE())
;

Here are the results of that run.

776

As you can see, SET STATISTICS TIME ON even affected direct code and not by just a little.  SET STATISTICS caused the code to run more than 13% slower on my box.

An SQL Server 2000 Solution

For those that still have to work in an SQL Server 2000 environment, you can do the near equivalent of CROSS APPLY as a "Correlated Sub-Query. Like this…

/*********************************************************************
 Test the function for duration without using SET STATISTICS TIME ON
 using a Correlated Sub-Query so that it works in SQL Server 2000.
*********************************************************************/--===== Declare a "throw-away" variable to take display/storage times
     -- out of the picture.
DECLARE @Result INT
;
--===== Begin measuring duration using GETDATE
DECLARE @StartTime DATETIME;
 SELECT @StartTime = GETDATE()
;
--===== Test the code using the "throw-away" variable
 SELECT @Result = (SELECT Doubled FROM dbo.TimesTwo(SomeNumber))
   FROM #TestTable
;
--===== "Stop the timer" and report the duration
  PRINT DATEDIFF(ms,@StartTime,GETDATE())
;

That will give you the same amazing speed that CROSS APPLY of the inline function gave. In fact, on a 2005 box, it'll give even better performance than using CROSS APPLY for this example. Not by much, but it's still repeatably better. It's just butt ugly to use it that way and so most people won't use it.

Now, before you set about to converting all of your Scalar UDFs to iSFs with whichever method strikes your fancy, you need to know something very important…

IT DEPENDS!

Neither method is a panacea. Not every conversion from a Scalar UDF to an iSF is going to increase performance. In fact, the conversion can and does sometimes make for slower code. One of the reasons is because writing code "all in one query" isn't always better for performance. You have to do what you should always do when making any such conversion for performance reasons. TEST, TEST, TEST!

(Note: I had previously posted some code in this spot as an example of a Scalar UDF beating an iSF but I had made a mistake in testing which erroneously showed it to be faster.  My apologies for that mistake and a hearty "Well Done!" to Usman Butt for not only catching the problem but for posting the proper replacement iSF equivalent.)

Epilogue

Scalar UDFs are generally accepted as one of the worst things you can do performance-wise in T-SQL. However, they may have earned a reputation that they simply don't deserve because a common method of measuring performance (SET STATISTICS TIME ON) injects a major performance problem of its own when used to measure the performance of Scalar UDFs.

That, notwithstanding, some Scalar UDFs really are a lot slower than running direct code but direct code means there is no encapsulation and may not be as easy, quick, or consistent to support development efforts. Even Scalar UDFs have the advantage of getting everyone to do things the same way. Of course, everyone doing all the same thing the slow way isn't an advantage at all.

In some cases and if the Scalar UDF code can be written as a single query, there can be a highly significant performance gain by converting (or writing originally) the Scalar UDF as an "Inline Scalar Function" (iSF), which is the same as writing an "Inline Table Valued Function" (iTVF) that returns only one value.

Doing so is not a performance panacea, though. Although it hasn't been proven within the confines of this article, it's a well known fact that "all in one query" solutions are frequently performance problems themselves. To be sure of any benefit in converting Scalar UDFs to iSFs, you must test and it must be with a lot of rows to really do it right.

Last and certainly not least, understand that anything you measure will be changed especially when it comes to SET STATISTICS TIME ON.  Although I'll personally continue to use it on forums to conveniently show performance differences between coding methods, I'll do so only after I've made absolutely sure that it's not injecting major but artificial performance problems itself. 

Thanks for listening folks,

"Crack Filled!"

--Jeff Moden

References

Paul White's articles on APPLY (CROSS APPLY and OUTER APPLY) may be found at the following URLs. So far as I'm concerned, they're both a "Must Read".

http://www.sqlservercentral.com/articles/APPLY/69953/

http://www.sqlservercentral.com/articles/APPLY/69954/


1RBAR is pronounced "ree-bar" and is a "Modenism" for "Row By Agonizing Row". The term has come to mean any code method that requires more resources or causes longer runs because of its row-by-row nature even if the code is actually "set based" code.

2"Hidden RBAR" is just that. It is RBAR that is hidden and not obvious. While a given snippet of code may not have any type of explicit loop present, such as but not limited to a WHILE loop, it can be slow due to the hidden looping that most all T-SQL code does in the background (even a simple SELECT loops through rows at the machine language level). "Hidden RBAR" can usually be identified by extraordinary use of resources and/or the generation of many additional "internal" rows to process a single row. Please see the following articles for examples of "Hidden RBAR" and why it can be much worse than any form of RBAR including cursors and While loops even though, technically, it may sometimes be classified as "set based" code.

"Hidden RBAR: Triangular Joins"

http://www.sqlservercentral.com/articles/T-SQL/61539/

"Hidden RBAR: Counting with Recursive CTE's"

http://www.sqlservercentral.com/articles/T-SQL/74118/

Rate

4.93 (102)

You rated this post out of 5. Change rating

Share

Share

Rate

4.93 (102)

You rated this post out of 5. Change rating