Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Which would be faster, an SSIS Foreach or SQL Cursor for 6300 rows?


Which would be faster, an SSIS Foreach or SQL Cursor for 6300 rows?

Author
Message
Andrew P
Andrew P
SSC Journeyman
SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)

Group: General Forum Members
Points: 99 Visits: 805
I'm currently re-writing an inherited C#/SQL CSV import process due to speed issues with the previous process (reading each line with a C# StreamReader).

The CSV has 6300 rows, and 21 columns (mostly strings, a couple of dates), which are used to update figures in our internal database.

I'm currently unsure of which of the below ideas would be more efficient:
- A Foreach loop on all 21 columns, passing the 21 parameters to an Execute SQL Task and using to a Stored Procedure to update each row
- Modifying my Data Flow Control to dump the imported result set to an SQL table, and using an SQL Cursor to iterate through each row

I expect a cursor would be much more efficient than starting 6300 SSIS seperate Execute SQL tasks and passing all 21 parameters to it for each iteration of the foreach loop, but am unsure of this. Does anyone have any insight or guidance over which approach is more efficient? Any help would be greatly appreciated.
Matt Miller (#4)
Matt Miller (#4)
SSCertifiable
SSCertifiable (7.7K reputation)SSCertifiable (7.7K reputation)SSCertifiable (7.7K reputation)SSCertifiable (7.7K reputation)SSCertifiable (7.7K reputation)SSCertifiable (7.7K reputation)SSCertifiable (7.7K reputation)SSCertifiable (7.7K reputation)

Group: General Forum Members
Points: 7661 Visits: 18090
It's hard to tell at this point. Is there any way you can provide a bit more detail around the nature of the "update" you're doing?

If you're looking to update a single table with the values you're getting in, neither version of looping will be particularly efficient. You're be better off creating something set-based that performs the update. Assuming that is in the right neighborhood, look at bulk-loading the data into a working table, then running a SQL update. Unless you're doing something very strenuous, I'd expect an update affecting 6300 records to be *very* fast indeed.

That said - I'm guessing, because I don't know what is entailed in the update.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)

Group: General Forum Members
Points: 45329 Visits: 39936
sqlservercentral.com 32358 (5/4/2014)
I'm currently re-writing an inherited C#/SQL CSV import process due to speed issues with the previous process (reading each line with a C# StreamReader).

The CSV has 6300 rows, and 21 columns (mostly strings, a couple of dates), which are used to update figures in our internal database.

I'm currently unsure of which of the below ideas would be more efficient:
- A Foreach loop on all 21 columns, passing the 21 parameters to an Execute SQL Task and using to a Stored Procedure to update each row
- Modifying my Data Flow Control to dump the imported result set to an SQL table, and using an SQL Cursor to iterate through each row

I expect a cursor would be much more efficient than starting 6300 SSIS seperate Execute SQL tasks and passing all 21 parameters to it for each iteration of the foreach loop, but am unsure of this. Does anyone have any insight or guidance over which approach is more efficient? Any help would be greatly appreciated.


I'd have to say that either method is going to be relatively slow. Without know exactly how the data you import from the CSV will be used, I'll have to generalize a bit.

First and possibly not related to anything except for my general dislike (ok, it's a hatred :-)) for DTS, SSIS, Busiess Objects, Web Methods, and a shedload of other similar applications, I probably wouldn't go near SSIS with this task. There... like the little girl in the 6th sense said, "I feel much better now". :-P

If it were me, I'd Bulk Load (using either Bulk Insert or BCP depending on where the file was stored) the data into a staging table and evaluate/validate the data in each COLUMN (possibly, all columns at once). I wouldn't loop through each row.

After that, I'd write either a good ol' fashioned "upsert" (2005 and below) or use MERGE (2008 and above) to use the new data to update existing rows and to insert rows that don't already exist. At worst, it would be and Update followed by an Insert... not a process that did one row at a time... even if I were forced to do it in SSIS :-D.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Eirikur Eiriksson
Eirikur Eiriksson
SSCertifiable
SSCertifiable (6.8K reputation)SSCertifiable (6.8K reputation)SSCertifiable (6.8K reputation)SSCertifiable (6.8K reputation)SSCertifiable (6.8K reputation)SSCertifiable (6.8K reputation)SSCertifiable (6.8K reputation)SSCertifiable (6.8K reputation)

Group: General Forum Members
Points: 6812 Visits: 17756
Quick question, why do you think you need a Cursor?
Cool
Koen Verbeeck
Koen Verbeeck
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16518 Visits: 13210
sqlservercentral.com 32358 (5/4/2014)
I'm currently re-writing an inherited C#/SQL CSV import process due to speed issues with the previous process (reading each line with a C# StreamReader).

The CSV has 6300 rows, and 21 columns (mostly strings, a couple of dates), which are used to update figures in our internal database.

I'm currently unsure of which of the below ideas would be more efficient:
- A Foreach loop on all 21 columns, passing the 21 parameters to an Execute SQL Task and using to a Stored Procedure to update each row
- Modifying my Data Flow Control to dump the imported result set to an SQL table, and using an SQL Cursor to iterate through each row

I expect a cursor would be much more efficient than starting 6300 SSIS seperate Execute SQL tasks and passing all 21 parameters to it for each iteration of the foreach loop, but am unsure of this. Does anyone have any insight or guidance over which approach is more efficient? Any help would be greatly appreciated.


If you consider SSIS as an option, you are using an incorrect approach. Instead of using a for each loop and an execute SQL task, read the flat file with a data flow and use an OLE DB command to do the update. However, it still processes the data row by row and it fires off 6300 different update statements.

I would just dump the data into a SQL Server table and write a decent query on top of it.



How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at SQLKover.

MCSA SQL Server 2012 - MCSE Business Intelligence
Andrew P
Andrew P
SSC Journeyman
SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)SSC Journeyman (99 reputation)

Group: General Forum Members
Points: 99 Visits: 805
Apologies for not responding to the answers here faster. You've all convinced me that the approach I'm going for with row based programming isn't the best practice. I had chosen this over set based because there's a set of processes that need followed depending on the values in each row. I've begun work re-writing this C#/SQL mix as a flow-chart, and will then try to re-write this using set based logic and BULK INSERT for the insert process.

Because this is so far from SSIS and the original post's topic, I'll start a new question in the correct forum if I can't work out how to complete the reimplementation under SQL. Thanks heaps for the concern of my overall approach, I'm trying to move from simply getting things done to using best practices, and this challenge should help with that.
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)

Group: General Forum Members
Points: 45329 Visits: 39936
sqlservercentral.com 32358 (5/11/2014)
I had chosen this over set based because there's a set of processes that need followed depending on the values in each row.


Stop thinking like that. Start thing about the set of processes that each column needs, instead. Like the tagline in my signature line below says...

"Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search