Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


SSIS - Data Flow Task With Delete Very Slow


SSIS - Data Flow Task With Delete Very Slow

Author
Message
mberman
mberman
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 18
I have a Data Flow Task that reads from a Flat File Source (orders.csv) and then does an OLE DB Command to do a delete from a SQL database table if a match is found. This is the SQL command in the OLE DB Command Task:
Delete orders from orders with(index(IX_orderno))
WHERE orderno = ? and ordersuf = ? and oetype in ('r','t','d') and cono = ?

(The index IX_orderno is defined as non-clustered and is orderno + ordersuf + oetype + cono. )

There are 400,000 records in the orders.csv and the delete is extremely slow (over 1 hour.)

We do have 8 indexes on this table, all non-clustered. Would this have any ramifications on the deletion speed?

Help would greatly appreciated.
mberman
mberman
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 18
I tried with removing all the indexes from the table, except for the one being used, and then reran the task. The results were the same - no speed improvement.
OTF
OTF
Old Hand
Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)

Group: General Forum Members
Points: 316 Visits: 4128
mberman (3/15/2013)
I have a Data Flow Task that reads from a Flat File Source (orders.csv) and then does an OLE DB Command to do a delete from a SQL database table if a match is found. This is the SQL command in the OLE DB Command Task:
Delete orders from orders with(index(IX_orderno))
WHERE orderno = ? and ordersuf = ? and oetype in ('r','t','d') and cono = ?

(The index IX_orderno is defined as non-clustered and is orderno + ordersuf + oetype + cono. )

There are 400,000 records in the orders.csv and the delete is extremely slow (over 1 hour.)

We do have 8 indexes on this table, all non-clustered. Would this have any ramifications on the deletion speed?

Help would greatly appreciated.


The OLE DB Command Task will execute a Sql Command for every row in an ssis dataset.
This is typically not what you would want to do for wholesale deletes as what this means is that your Sql Command is running, possibly thousands of times deleting one row each time.
You can check this by spinning up Profiler and you will most likely see thousands of Delete actions.

This is probably why the Delete is slow.

You could try importing the data from the flat file into a scratch/temp. table in the database and then perform a batch delete via JOINS/MERGE or some such.
OTF
OTF
Old Hand
Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)

Group: General Forum Members
Points: 316 Visits: 4128
Just to clarify I meant the OLE DB Sql Command is run for every qualifying row not every row

I would add that the times I have seen the OLE DB Command used is normally to simulate some kind of MERGE. My experience has been it is faster to do that kind of thing in Pure Sql, if you are able to do so.
david.alcock
david.alcock
SSC Veteran
SSC Veteran (259 reputation)SSC Veteran (259 reputation)SSC Veteran (259 reputation)SSC Veteran (259 reputation)SSC Veteran (259 reputation)SSC Veteran (259 reputation)SSC Veteran (259 reputation)SSC Veteran (259 reputation)

Group: General Forum Members
Points: 259 Visits: 1158
Have you tried removing the index hint, just to guage any difference in performance?

What I would do, as suggested, is move all csv rows into a staging table, perform DELETE FROM ... JOIN function to remove matching rows. SQL is much better working on sets of data.

'Only he who wanders finds new paths'
Koen Verbeeck
Koen Verbeeck
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16465 Visits: 13207
OTF (3/15/2013)


The OLE DB Command Task will execute a Sql Command for every row in an ssis dataset.
...

This is probably why the Delete is slow.


Not probably. It's with great certainty you can say the OLE DB command is the reason for the slow performance.



How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at SQLKover.

MCSA SQL Server 2012 - MCSE Business Intelligence
mberman
mberman
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 18
Thank you for your replies. The information was right on.
I created a scratch table in the database (orders_del) that is just being used to hold the order#'s to be deleted. (I do a truncate on the table first thing.)
Then the Execute SQL Command Taks does the delete:
delete FROM dbo.orders WHERE exists
(SELECT * FROM dbo.orders_del AS tmp
WHERE tmp.cono = dbo.orders.cono
AND tmp.orderno = dbo.orders.orderno
AND tmp.ordersuf = dbo.orders.ordersuf)

This reduced the time from 3 hours to 11 minutes!
OTF
OTF
Old Hand
Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)Old Hand (316 reputation)

Group: General Forum Members
Points: 316 Visits: 4128
mberman (3/19/2013)
Thank you for your replies. The information was right on.
I created a scratch table in the database (orders_del) that is just being used to hold the order#'s to be deleted. (I do a truncate on the table first thing.)
Then the Execute SQL Command Taks does the delete:
delete FROM dbo.orders WHERE exists
(SELECT * FROM dbo.orders_del AS tmp
WHERE tmp.cono = dbo.orders.cono
AND tmp.orderno = dbo.orders.orderno
AND tmp.ordersuf = dbo.orders.ordersuf)

This reduced the time from 3 hours to 11 minutes!


I guess Koen was right... a dead certainty that it was the issue :-)

Glad you are getting better performance out of it now.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search