Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

SSIS - Data Flow Task With Delete Very Slow Expand / Collapse
Author
Message
Posted Friday, March 15, 2013 3:26 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, November 6, 2013 2:56 PM
Points: 4, Visits: 13
I have a Data Flow Task that reads from a Flat File Source (orders.csv) and then does an OLE DB Command to do a delete from a SQL database table if a match is found. This is the SQL command in the OLE DB Command Task:
Delete orders from orders with(index(IX_orderno))
WHERE orderno = ? and ordersuf = ? and oetype in ('r','t','d') and cono = ?

(The index IX_orderno is defined as non-clustered and is orderno + ordersuf + oetype + cono. )

There are 400,000 records in the orders.csv and the delete is extremely slow (over 1 hour.)

We do have 8 indexes on this table, all non-clustered. Would this have any ramifications on the deletion speed?

Help would greatly appreciated.
Post #1431420
Posted Friday, March 15, 2013 3:46 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, November 6, 2013 2:56 PM
Points: 4, Visits: 13
I tried with removing all the indexes from the table, except for the one being used, and then reran the task. The results were the same - no speed improvement.
Post #1431427
Posted Friday, March 15, 2013 5:54 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Thursday, October 31, 2013 3:44 AM
Points: 314, Visits: 4,128
mberman (3/15/2013)
I have a Data Flow Task that reads from a Flat File Source (orders.csv) and then does an OLE DB Command to do a delete from a SQL database table if a match is found. This is the SQL command in the OLE DB Command Task:
Delete orders from orders with(index(IX_orderno))
WHERE orderno = ? and ordersuf = ? and oetype in ('r','t','d') and cono = ?

(The index IX_orderno is defined as non-clustered and is orderno + ordersuf + oetype + cono. )

There are 400,000 records in the orders.csv and the delete is extremely slow (over 1 hour.)

We do have 8 indexes on this table, all non-clustered. Would this have any ramifications on the deletion speed?

Help would greatly appreciated.


The OLE DB Command Task will execute a Sql Command for every row in an ssis dataset.
This is typically not what you would want to do for wholesale deletes as what this means is that your Sql Command is running, possibly thousands of times deleting one row each time.
You can check this by spinning up Profiler and you will most likely see thousands of Delete actions.

This is probably why the Delete is slow.

You could try importing the data from the flat file into a scratch/temp. table in the database and then perform a batch delete via JOINS/MERGE or some such.
Post #1431481
Posted Friday, March 15, 2013 5:59 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Thursday, October 31, 2013 3:44 AM
Points: 314, Visits: 4,128

Just to clarify I meant the OLE DB Sql Command is run for every qualifying row not every row

I would add that the times I have seen the OLE DB Command used is normally to simulate some kind of MERGE. My experience has been it is faster to do that kind of thing in Pure Sql, if you are able to do so.
Post #1431485
Posted Friday, March 15, 2013 6:36 AM


SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Today @ 1:53 AM
Points: 211, Visits: 845
Have you tried removing the index hint, just to guage any difference in performance?

What I would do, as suggested, is move all csv rows into a staging table, perform DELETE FROM ... JOIN function to remove matching rows. SQL is much better working on sets of data.


'Only he who wanders finds new paths'
Post #1431504
Posted Friday, March 15, 2013 7:28 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 10:26 AM
Points: 13,359, Visits: 10,223
OTF (3/15/2013)


The OLE DB Command Task will execute a Sql Command for every row in an ssis dataset.
...

This is probably why the Delete is slow.


Not probably. It's with great certainty you can say the OLE DB command is the reason for the slow performance.




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #1431525
Posted Tuesday, March 19, 2013 11:09 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, November 6, 2013 2:56 PM
Points: 4, Visits: 13
Thank you for your replies. The information was right on.
I created a scratch table in the database (orders_del) that is just being used to hold the order#'s to be deleted. (I do a truncate on the table first thing.)
Then the Execute SQL Command Taks does the delete:
delete FROM dbo.orders WHERE exists
(SELECT * FROM dbo.orders_del AS tmp
WHERE tmp.cono = dbo.orders.cono
AND tmp.orderno = dbo.orders.orderno
AND tmp.ordersuf = dbo.orders.ordersuf)

This reduced the time from 3 hours to 11 minutes!
Post #1432997
Posted Wednesday, March 20, 2013 8:06 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Thursday, October 31, 2013 3:44 AM
Points: 314, Visits: 4,128
mberman (3/19/2013)
Thank you for your replies. The information was right on.
I created a scratch table in the database (orders_del) that is just being used to hold the order#'s to be deleted. (I do a truncate on the table first thing.)
Then the Execute SQL Command Taks does the delete:
delete FROM dbo.orders WHERE exists
(SELECT * FROM dbo.orders_del AS tmp
WHERE tmp.cono = dbo.orders.cono
AND tmp.orderno = dbo.orders.orderno
AND tmp.ordersuf = dbo.orders.ordersuf)

This reduced the time from 3 hours to 11 minutes!


I guess Koen was right... a dead certainty that it was the issue

Glad you are getting better performance out of it now.

Post #1433237
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse