Performance issue using UDF in bulk load jobs (SSIS)

Question

Performance issue using UDF in bulk load jobs (SSIS)

Mohammed Mohsin-392707

Mr or Mrs. 500

Points: 589
More actions
November 11, 2012 at 11:04 am

#270762

Hi All,
We have bulk load jobs configured using SSIS packages. All the load jobs were working perfectly fine until 2 days back. Suddenly, our load jobs are taking way longer than the normal time. The issue is identified to be the use of UDF as shown below,
Note: These jobs were working fine since 3 years without any issues. We load around 16 GB of data each day.
No changes to the server and\or SQL Server itself.
Any help would be greatly appreciated.
Thanks,
-Mohsin
ALTER FUNCTION [dbo].[convertdate_null] (@input varchar(50))
RETURNS VARCHAR(12)
AS
BEGIN
DECLARE @output DATETIME;
SELECT @output = CASE WHEN (ISDATE ( ( CASE WHEN Left(replace(ltrim(rtrim(@input)),'+',''), 1) = 1
THEN '20'
ELSE '19'
END + Right(ltrim(rtrim(@input)), 6))
) = 1)
THEN CONVERT(VARCHAR(12),CONVERT (DATETIME, CASE WHEN Left(replace(ltrim(rtrim(@input)),'+',''), 1) = 1
THEN '20'
ELSE '19'
END + Right(ltrim(rtrim(@input)), 6) ) )
ELSE
null
END
RETURN @output
END

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 1

Mohammed Mohsin-392707 (11/11/2012)
Hi All,
We have bulk load jobs configured using SSIS packages. All the load jobs were working perfectly fine until 2 days back. Suddenly, our load jobs are taking way longer than the normal time. The issue is identified to be the use of UDF as shown below,
Note: These jobs were working fine since 3 years without any issues. We load around 16 GB of data each day.
No changes to the server and\or SQL Server itself.

The function is a scalar UDF which is a performance problem in and of itself. That, notwithstanding, things don't just suddenly start running slower unless something actually did change. Perhaps the target table of your load finally reached a tipping point where the existing statistics are no longer valid causing a change for the worse in the execution plan. Perhaps the server is using more memory for other things than it used to.

I recommend that you first rework the scalar UDF and the code or process that uses it to be an Inline Table Valued function that returns a scalar value. That will take care of some of the performance problem. Once that is done, I'd re-evaluate any slownness that remains.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Randy Knight SSC-Addicted Points: 468 More actions · Answer 2

How is the UDF used? In a join or where clause predicate? Select list?

Mohammed Mohsin-392707 Mr or Mrs. 500 Points: 589 More actions · Answer 3

Mohammed Mohsin-392707

Mr or Mrs. 500

Points: 589

November 11, 2012 at 10:36 pm

#1558005

In a select list.

Thanks,

-Mohsin

Dave Ballantyne SSC-Dedicated Points: 33566 More actions · Answer 4

Since SQL Server is generally quite poor a string manipulation, the obvious thing to do would be to move this logic into the layer that is most approriate ie the SSIS task.

Will be a relatively trivial thing to do.

Clear Sky SQL
My Blog[/url]

Mohammed Mohsin-392707 Mr or Mrs. 500 Points: 589 More actions · Answer 5

I am using this SQL within the data flow task of SSIS package. I am doing simple select from a stage table and inserting into the base table. Wherever date coumns are there in stage, I am using UDF to validate and return.

Any quick things I can take a look? Changing any thing related to UDF is a huge task as it has been used in more than 500 locations.

Thanks,

-Mohsin

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 6

Mohammed Mohsin-392707 (11/12/2012)
I am using this SQL within the data flow task of SSIS package. I am doing simple select from a stage table and inserting into the base table. Wherever date coumns are there in stage, I am using UDF to validate and return.
Any quick things I can take a look? Changing any thing related to UDF is a huge task as it has been used in more than 500 locations.
Thanks,
-Mohsin

Then create a new iTVF UDF just for this task to start with. The current UDF is a scalar UDF and will always be a performance problem.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Randy Knight SSC-Addicted Points: 468 More actions · Answer 7

I agree with Jeff. Or for that matter since this is just in the select list, that scalar function can be turned into a case statement very easily.