Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Scalar UDF Performance Issue Expand / Collapse
Author
Message
Posted Wednesday, June 17, 2009 3:03 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, June 10, 2010 10:22 AM
Points: 3, Visits: 41
I'm trying to tidy up a SPROC by wrapping a CASE statement up in a function and I'm geting a huge performance hit. If I leave the CASE statement in the base script it runs in a second, I move it to a function and it takes over a minute. Below is how I'm handling the case portion inline and using the function as well as the function itself. What am I missing?

TIA

INLINE SNIPPET:
		
"StaffType" = CASE
WHEN PER_ORG = 'CWR' AND EMPL_CLASS = 'C' AND REG_TEMP = 'R' THEN 'Contractor'
WHEN PER_ORG = 'CWR' AND EMPL_CLASS = 'C' AND REG_TEMP = 'T' THEN 'Contractor'
WHEN PER_ORG = 'CWR' AND EMPL_CLASS = 'G' AND REG_TEMP = 'T' THEN 'Contractor'
WHEN PER_ORG = 'CWR' AND EMPL_CLASS = 'T' AND REG_TEMP = 'T' THEN 'Contractor'
WHEN PER_ORG = 'CWR' AND EMPL_CLASS = 'V' AND REG_TEMP = 'T' THEN 'Vendor'
WHEN PER_ORG = 'EMP' AND EMPL_CLASS = '' AND REG_TEMP = 'R' THEN 'Employee'
WHEN PER_ORG = 'EMP' AND EMPL_CLASS = '' AND REG_TEMP = 'T' THEN 'Employee'
WHEN PER_ORG = 'EMP' AND EMPL_CLASS = 'C' AND REG_TEMP = 'R' THEN 'Employee'
WHEN PER_ORG = 'EMP' AND EMPL_CLASS = 'C' AND REG_TEMP = 'T' THEN 'Employee'
WHEN PER_ORG = 'EMP' AND EMPL_CLASS = 'G' AND REG_TEMP = 'T' THEN 'Employee'
WHEN PER_ORG = 'EMP' AND EMPL_CLASS = 'I' AND REG_TEMP = 'T' THEN 'Employee'
WHEN PER_ORG = 'EMP' AND EMPL_CLASS = 'T' AND REG_TEMP = 'T' THEN 'Employee'
ELSE 'ERROR'
END


FUNCTION CALL VERSION SNIPPET:

, "StaffType" = (SELECT dbo.fn_StaffTypeMap(p.EMPLID))

FUNCTION:
USE [HRData]
GO
/****** Object: UserDefinedFunction [dbo].[fn_StaffTypeMap] Script Date: 06/17/2009 13:00:31 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

/*

AUTHOR: Bill King
CREATE DATE: 6/16/2009
DESCRIPTION: Used to convert the PER_ORG/Empl_Class/Reg_Temp combos to
a more user friendly format.

COMMENTS:


SAMPLE EXECUTION: SELECT dbo.fn_StaffTypeMap ('16011')

*/

CREATE FUNCTION [dbo].[fn_StaffTypeMap]
(
@EmplID CHAR(5)
)

RETURNS VARCHAR(20)
AS
BEGIN

DECLARE @MappedValue AS VARCHAR(20)
SELECT @MappedValue = (
SELECT CASE
WHEN PER_ORG = 'CWR' AND EMPL_CLASS = 'C' AND REG_TEMP = 'R' THEN 'Contractor'
WHEN PER_ORG = 'CWR' AND EMPL_CLASS = 'C' AND REG_TEMP = 'T' THEN 'Contractor'
WHEN PER_ORG = 'CWR' AND EMPL_CLASS = 'G' AND REG_TEMP = 'T' THEN 'Contractor'
WHEN PER_ORG = 'CWR' AND EMPL_CLASS = 'T' AND REG_TEMP = 'T' THEN 'Contractor'
WHEN PER_ORG = 'CWR' AND EMPL_CLASS = 'V' AND REG_TEMP = 'T' THEN 'Vendor'
WHEN PER_ORG = 'EMP' AND EMPL_CLASS = '' AND REG_TEMP = 'R' THEN 'Employee'
WHEN PER_ORG = 'EMP' AND EMPL_CLASS = '' AND REG_TEMP = 'T' THEN 'Employee'
WHEN PER_ORG = 'EMP' AND EMPL_CLASS = 'C' AND REG_TEMP = 'R' THEN 'Employee'
WHEN PER_ORG = 'EMP' AND EMPL_CLASS = 'C' AND REG_TEMP = 'T' THEN 'Employee'
WHEN PER_ORG = 'EMP' AND EMPL_CLASS = 'G' AND REG_TEMP = 'T' THEN 'Employee'
WHEN PER_ORG = 'EMP' AND EMPL_CLASS = 'I' AND REG_TEMP = 'T' THEN 'Employee'
WHEN PER_ORG = 'EMP' AND EMPL_CLASS = 'T' AND REG_TEMP = 'T' THEN 'Employee'
ELSE 'ERROR'
END
FROM PSPerson
WHERE EmplID = @EmplID

)
RETURN (@MappedValue)
END









Post #737008
Posted Wednesday, June 17, 2009 4:13 PM


SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Thursday, May 10, 2012 9:49 AM
Points: 800, Visits: 1,759
Problem with scalar functions used this way - where you pass a column as a parameter - is that they are not inlined by the query compiler. This means they are executed for each row and this is the main performance issue.
There is a trick you can do - instead of scalar function you can write a table valued function that returns 1 row. Query optimizer is then wise enough to choose different execution plan which allows to avoid the performance hit.
So the function would look like this:
 
CREATE FUNCTION [dbo].[fn_StaffTypeMap]
(
@EmplID CHAR(5)
)

RETURNS TABLE
AS
RETURN SELECT convert(varchar(20), [your CASE statement]) as res;
GO



HTH

Piotr



...and your only reply is slàinte mhath
Post #737054
Posted Wednesday, June 17, 2009 7:22 PM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Monday, May 12, 2014 12:57 PM
Points: 2,281, Visits: 4,225
Piotr is quite correct that the function is executed for each row and the start-up and shutdown cost of each function execution is very expensive including such processing as memory allocation and deallocation.

If the objective is not to hardcode the mapping in all of the SQL, a better solution is to have an auxiliary table:
CREATE TABLE StaffTypeMap
( PER_ORG
, EMPL_CLASS
, REG_TEMP
, StaffType
, PRIMARY KEY ( PER_ORG , EMPL_CLASS, REG_TEMP )
)

THEN your SQL is on the pattern of:
SELECT COALESCE(StaffTypeMap.StaffType, 'UNKNOWN')
FROM X
LEFT OUTER JOIN
StaffTypeMap
ON StaffTypeMap.PER_ORG = X.PER_ORG
AND StaffTypeMap.EMPL_CLASS = X.EMPL_CLASS
AND StaffTypeMap.REG_TEMP = X.REG_TEMP



SQL = Scarcely Qualifies as a Language
Post #737096
Posted Wednesday, June 17, 2009 10:36 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Friday, October 17, 2014 8:13 AM
Points: 9,926, Visits: 11,188
Yet other options include:

1. Creating a view (possibly indexed) over PSPerson and adding a column which contains the CASE.
2. Adding a computed column for the CASE to the PSPerson table, and preferably persisting it.

If you wish, you can encapsulate the CASE (without the data access) in a function and use that in the persisted computed column definition. An example of this idea follows:

USE tempdb
GO
CREATE FUNCTION dbo.f (@p1 INT, @p2 INT)
RETURNS INT
WITH SCHEMABINDING, RETURNS NULL ON NULL INPUT
AS BEGIN RETURN CASE WHEN @p1 = 1 AND @p2 = 1 THEN 0 WHEN @p1 = 1 AND @p2 = 2 THEN 1 WHEN @p1 = 2 AND @p2 = 1 THEN 2 ELSE 3 END END
GO
CREATE TABLE dbo.Demo (A INT NOT NULL, B INT NOT NULL, C AS dbo.f (A, B) PERSISTED);
GO
INSERT dbo.Demo (A, B)
SELECT TOP (25000)
CONVERT(INT,RAND(CHECKSUM(NEWID())) * 2 + 1),
CONVERT(INT,RAND(CHECKSUM(NEWID())) * 2 + 1)
FROM master.sys.columns c1
CROSS
JOIN master.sys.columns c2
GO
SELECT *
FROM dbo.Demo;
GO
DROP TABLE dbo.Demo;
DROP FUNCTION dbo.f;

Paul




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #737152
Posted Thursday, June 18, 2009 10:47 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, June 10, 2010 10:22 AM
Points: 3, Visits: 41
Thanks all. Much appreciated!


Post #737769
Posted Saturday, June 20, 2009 1:57 PM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, September 28, 2014 1:41 PM
Points: 959, Visits: 2,888
Paul,
I've kind of forgotten what the advantage to SCHEMABINGING is for a scalar UDF. I think I read it somewhere, but I can't find it in my SQL saved SQL docs. What does this do exactly?
Todd Fifield
Post #738894
Posted Saturday, June 20, 2009 8:28 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Friday, October 17, 2014 8:13 AM
Points: 9,926, Visits: 11,188
tfifield (6/20/2009)
Paul,
I've kind of forgotten what the advantage to SCHEMABINGING is for a scalar UDF. I think I read it somewhere, but I can't find it in my SQL saved SQL docs. What does this do exactly?
Todd Fifield

Take a look at this: http://blogs.msdn.com/sqlprogrammability/archive/2006/05/12/596424.aspx

Cheers,

Paul




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #738940
Posted Saturday, June 20, 2009 9:12 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 8:45 PM
Points: 35,342, Visits: 31,875
Nice article and great explanation in the article... too bad they didn't actually test it for performance... Both of the following UDF's render identical execution plans and they both take turns winning on my desktop. That also means that compared to the inline code, they both make some nasty sucking sounds. I got the function code below from the article.


--===== Create a scalar function with SchemaBinding 
 CREATE FUNCTION dbo.ComputeNumSB(@i INT)
RETURNS INT
   WITH SCHEMABINDING
  BEGIN
        RETURN @i * 2 + 50
    END
GO
--===== Create a scalar function without SchemaBinding
 CREATE FUNCTION dbo.ComputeNum(@i INT)
RETURNS INT
  BEGIN
        RETURN @i * 2 + 50
    END
GO
--===== Create and populate the Tally table on the fly
 SELECT TOP 1000000 --equates to more than 30 years of dates
        IDENTITY(INT,1,1) AS N
   INTO dbo.Tally
   FROM Master.dbo.SysColumns sc1,
        Master.dbo.SysColumns sc2

--===== Add a Primary Key to maximize performance
  ALTER TABLE dbo.Tally
    ADD CONSTRAINT PK_Tally_N 
        PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

--===== Allow the general public to use it
  GRANT SELECT ON dbo.Tally TO PUBLIC
GO
--===== Test more than once... the functions take turns winning.
     -- The inline code ALWAYS blows the functions away.
DECLARE @BitBucket INT
DECLARE @TestSize  INT
 SELECT @TestSize = 100000

SET STATISTICS TIME ON
 SELECT @BitBucket = dbo.ComputeNumSB(t.N)
   FROM dbo.Tally t
  WHERE t.N <= @TestSize

 SELECT @BitBucket = dbo.ComputeNum(t.N)
   FROM dbo.Tally t
  WHERE t.N <= @TestSize

 SELECT @BitBucket = t.N * 2 + 50
   FROM dbo.Tally t
  WHERE t.N <= @TestSize
SET STATISTICS TIME OFF
PRINT REPLICATE('=',80)
GO 6



For those that don't know what a Tally table is, now would be a good time to learn. Please see the following article...
http://www.sqlservercentral.com/articles/T-SQL/62867/

I agree with most of the other's said here... a UDF is probably not the way to go here. An aux table or a bit of inline code is the way to go. Here's the run results from the code above...

Beginning execution loop

SQL Server Execution Times:
CPU time = 8625 ms, elapsed time = 16175 ms.

SQL Server Execution Times:
CPU time = 8922 ms, elapsed time = 16226 ms.

SQL Server Execution Times:
CPU time = 78 ms, elapsed time = 78 ms.
================================================================================

SQL Server Execution Times:
CPU time = 8704 ms, elapsed time = 16103 ms.

SQL Server Execution Times:
CPU time = 8875 ms, elapsed time = 16358 ms.

SQL Server Execution Times:
CPU time = 78 ms, elapsed time = 76 ms.
================================================================================

SQL Server Execution Times:
CPU time = 8203 ms, elapsed time = 16051 ms.

SQL Server Execution Times:
CPU time = 8406 ms, elapsed time = 16075 ms.

SQL Server Execution Times:
CPU time = 78 ms, elapsed time = 94 ms.
================================================================================

SQL Server Execution Times:
CPU time = 8922 ms, elapsed time = 15994 ms.

SQL Server Execution Times:
CPU time = 8922 ms, elapsed time = 19858 ms.

SQL Server Execution Times:
CPU time = 78 ms, elapsed time = 81 ms.
================================================================================

SQL Server Execution Times:
CPU time = 8203 ms, elapsed time = 17882 ms.

SQL Server Execution Times:
CPU time = 8844 ms, elapsed time = 19245 ms.

SQL Server Execution Times:
CPU time = 94 ms, elapsed time = 86 ms.
================================================================================

SQL Server Execution Times:
CPU time = 8297 ms, elapsed time = 16888 ms.

SQL Server Execution Times:
CPU time = 8656 ms, elapsed time = 18983 ms.

SQL Server Execution Times:
CPU time = 78 ms, elapsed time = 79 ms.
================================================================================
Batch execution completed 6 times.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #738950
Posted Saturday, June 20, 2009 9:34 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Friday, October 17, 2014 8:13 AM
Points: 9,926, Visits: 11,188
Jeff Moden (6/20/2009)
Nice article and great explanation in the article... too bad they didn't actually test it for performance... Both of the following UDF's render identical execution plans and they both take turns winning on my desktop. That also means that compared to the inline code, they both make some nasty sucking sounds. I got the function code below from the article.[/font]

Hey Jeff,

Short answer: Try it with an UPDATE statement as the article does

Long answer in progress

Paul





Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #738954
Posted Saturday, June 20, 2009 10:30 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Friday, October 17, 2014 8:13 AM
Points: 9,926, Visits: 11,188
Long answer:

SCHEMABINDING performs no magic with simple integer math - so we don't expect a difference with the SELECTs in your example code.

Where it does make a huge difference, is when the query optimizer (QO) must consider whether the function is deterministic or not. The QO has many fewer options when it encounters a non-deterministic function.

When marked as schema-bound, the engine analyzes the content of the function to decide on whether it is deterministic or not.

A scalar function which is not marked with SCHEMABINDING it is marked as doing data access - without analyzing the content of the function at all. This not only prevents it from be evaluated with deterministic optimizations, it also means that things like halloween protection become necessary in UPDATE statements, where they would not be otherwise. (This is the specific example in the article I linked to.)

The script below illustrates the dramatic differences that often appear.

USE tempdb;
GO
-- Conditional drops
IF OBJECT_ID(N'dbo.ComputeNum', N'FN') IS NOT NULL DROP FUNCTION dbo.ComputeNum;
IF OBJECT_ID(N'dbo.ComputeNumSB', N'FN') IS NOT NULL DROP FUNCTION dbo.ComputeNumSB;
IF OBJECT_ID(N'dbo.TestTable', N'U') IS NOT NULL DROP TABLE dbo.TestTable;
GO
-- Schema-bound (will be evaluated for determinism)
CREATE FUNCTION dbo.ComputeNumSB (@i INT) RETURNS INT WITH SCHEMABINDING AS BEGIN RETURN @i * 2 + 50 END;
GO
-- Not schema-bound, will be marked as doing data access
CREATE FUNCTION dbo.ComputeNum (@i INT) RETURNS INT AS BEGIN RETURN @i * 2 + 50 END;
GO
-- Create a test table which we can update
CREATE TABLE dbo.TestTable (A INT IDENTITY PRIMARY KEY, B INT NOT NULL)
GO
INSERT dbo.TestTable (B)
SELECT TOP (100000)
CONVERT(INT, 123456 * RAND(CHECKSUM(NEWID())))
FROM master.sys.columns C1
CROSS
JOIN master.sys.columns C2
GO
-- Check the properties of the functions
/*
SELECT OBJECTPROPERTYEX(OBJECT_ID(N'dbo.ComputeNum', N'FN'), 'IsDeterministic') -- 0 = No
SELECT OBJECTPROPERTYEX(OBJECT_ID(N'dbo.ComputeNum', N'FN'), 'SystemDataAccess') -- 1 = Yes
SELECT OBJECTPROPERTYEX(OBJECT_ID(N'dbo.ComputeNum', N'FN'), 'UserDataAccess') -- 1 = Yes

SELECT OBJECTPROPERTYEX(OBJECT_ID(N'dbo.ComputeNumSB', N'FN'), 'IsDeterministic') -- 1 = Yes
SELECT OBJECTPROPERTYEX(OBJECT_ID(N'dbo.ComputeNumSB', N'FN'), 'SystemDataAccess') -- 0 = No
SELECT OBJECTPROPERTYEX(OBJECT_ID(N'dbo.ComputeNumSB', N'FN'), 'UserDataAccess') -- 0 = No
*/
GO
--
-- TESTS
--
DECLARE @TestSize INT;
SELECT @TestSize = 100000;

SET STATISTICS IO ON;

-- Non-schema-bound version (compare the plans!)
BEGIN TRANSACTION
UPDATE dbo.TestTable
SET B = dbo.ComputeNum(B)
WHERE A BETWEEN 1 AND @TestSize
ROLLBACK

--Table 'TestTable'. Scan count 1, logical reads 200213, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Table 'Worktable'. Scan count 1, logical reads 201856, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

-- Schema-bound
BEGIN TRANSACTION
UPDATE dbo.TestTable
SET B = dbo.ComputeNumSB(B)
WHERE A BETWEEN 1 AND @TestSize
ROLLBACK

--Table 'TestTable'. Scan count 1, logical reads 213, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SET STATISTICS IO OFF;
GO
IF OBJECT_ID(N'dbo.ComputeNum', N'FN') IS NOT NULL DROP FUNCTION dbo.ComputeNum;
IF OBJECT_ID(N'dbo.ComputeNumSB', N'FN') IS NOT NULL DROP FUNCTION dbo.ComputeNumSB;
IF OBJECT_ID(N'dbo.TestTable', N'U') IS NOT NULL DROP TABLE dbo.TestTable;




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #738974
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse