SQLServerCentral is supported by Red Gate Software Ltd.
Which would be faster, an SSIS Foreach or SQL Cursor for 6300 rows?
Posted Sunday, May 4, 2014 3:37 PM
Grasshopper


Group: General Forum Members
Last Login: Wednesday, July 2, 2014 9:01 PM
Points: 10, Visits: 103
I'm currently re-writing an inherited C#/SQL CSV import process due to speed issues with the previous process (reading each line with a C# StreamReader).

The CSV has 6300 rows, and 21 columns (mostly strings, a couple of dates), which are used to update figures in our internal database.

I'm currently unsure of which of the below ideas would be more efficient:
- A Foreach loop on all 21 columns, passing the 21 parameters to an Execute SQL Task and using a Stored Procedure to update each row
- Modifying my Data Flow Control to dump the imported result set to an SQL table, and using an SQL Cursor to iterate through each row

I expect a cursor would be much more efficient than starting 6300 separate SSIS Execute SQL Tasks and passing all 21 parameters to each one for every iteration of the Foreach loop, but am unsure of this. Does anyone have any insight or guidance on which approach is more efficient? Any help would be greatly appreciated.
Post #1567372
Posted Sunday, May 4, 2014 6:41 PM


SSCertifiable


Group: General Forum Members
Last Login: Yesterday @ 9:16 PM
Points: 7,115, Visits: 14,984
It's hard to tell at this point. Is there any way you can provide a bit more detail around the nature of the "update" you're doing?

If you're looking to update a single table with the values you're getting in, neither version of looping will be particularly efficient. You'd be better off creating something set-based that performs the update. Assuming that's in the right neighborhood, look at bulk-loading the data into a working table, then running a single SQL update. Unless you're doing something very strenuous, I'd expect an update affecting 6300 records to be *very* fast indeed.
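As a rough sketch of that set-based approach (table and column names here are invented for illustration, and it assumes the 6300 CSV rows have already been bulk-loaded into a staging table):

```sql
-- Hypothetical names throughout; adjust to the real schema.
UPDATE tgt
SET    tgt.Figure     = src.Figure,
       tgt.FigureDate = src.FigureDate
FROM   dbo.InternalFigures AS tgt
INNER JOIN dbo.CsvStaging  AS src
        ON src.KeyColumn = tgt.KeyColumn;
```

One joined UPDATE touches every matching row at once, which is why it tends to beat 6300 single-row calls regardless of whether those calls come from a cursor or from SSIS.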

That said - I'm guessing, because I don't know what is entailed in the update.


----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part... unless you're my manager... or a director and above... or a really loud-spoken end-user... All right - what was my emergency again?
Post #1567379
Posted Sunday, May 4, 2014 7:58 PM


SSC-Dedicated


Group: General Forum Members
Last Login: Yesterday @ 11:19 PM
Points: 36,724, Visits: 31,173
sqlservercentral.com 32358 (5/4/2014)
I'm currently re-writing an inherited C#/SQL CSV import process due to speed issues with the previous process (reading each line with a C# StreamReader).

The CSV has 6300 rows, and 21 columns (mostly strings, a couple of dates), which are used to update figures in our internal database.

I'm currently unsure of which of the below ideas would be more efficient:
- A Foreach loop on all 21 columns, passing the 21 parameters to an Execute SQL Task and using a Stored Procedure to update each row
- Modifying my Data Flow Control to dump the imported result set to an SQL table, and using an SQL Cursor to iterate through each row

I expect a cursor would be much more efficient than starting 6300 separate SSIS Execute SQL Tasks and passing all 21 parameters to each one for every iteration of the Foreach loop, but am unsure of this. Does anyone have any insight or guidance on which approach is more efficient? Any help would be greatly appreciated.


I'd have to say that either method is going to be relatively slow. Without knowing exactly how the data you import from the CSV will be used, I'll have to generalize a bit.

First, and possibly not related to anything except for my general dislike (ok, it's a hatred) for DTS, SSIS, Business Objects, Web Methods, and a shedload of other similar applications, I probably wouldn't go near SSIS for this task. There... like the little girl in The Sixth Sense said, "I feel much better now".

If it were me, I'd Bulk Load (using either Bulk Insert or BCP depending on where the file was stored) the data into a staging table and evaluate/validate the data in each COLUMN (possibly, all columns at once). I wouldn't loop through each row.
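For example, a minimal BULK INSERT into a staging table might look like the following (the table name, column names, and file path are all hypothetical):

```sql
-- Hypothetical staging table: load everything as text first,
-- then validate/convert each column in a set-based pass.
CREATE TABLE dbo.CsvStaging
(
    KeyColumn  varchar(50) NULL,
    Figure     varchar(50) NULL,
    FigureDate varchar(50) NULL
    -- ...the rest of the 21 columns...
);

BULK INSERT dbo.CsvStaging
FROM 'C:\Import\figures.csv'   -- path as seen by the SQL Server service
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);
```

Loading everything as varchar first means a bad date or number in the file won't kill the load; you find it afterwards with a set-based validation query against the whole column.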

After that, I'd write either a good ol' fashioned "upsert" (2005 and below) or use MERGE (2008 and above) to use the new data to update existing rows and to insert rows that don't already exist. At worst, it would be an Update followed by an Insert... not a process that did one row at a time... even if I were forced to do it in SSIS.
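A sketch of the MERGE version (again with invented table and column names, and assuming the staging data has already been validated and converted):

```sql
-- Hypothetical names; one statement handles both the update and the insert.
MERGE dbo.InternalFigures AS tgt
USING dbo.CsvStaging      AS src
   ON src.KeyColumn = tgt.KeyColumn
WHEN MATCHED THEN
    UPDATE SET tgt.Figure     = src.Figure,
               tgt.FigureDate = src.FigureDate
WHEN NOT MATCHED BY TARGET THEN
    INSERT (KeyColumn, Figure, FigureDate)
    VALUES (src.KeyColumn, src.Figure, src.FigureDate);
```

On 2005 and below, the equivalent "upsert" is just the joined UPDATE followed by an INSERT...SELECT with a NOT EXISTS check.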


--Jeff Moden
"RBAR" is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1567390
Posted Sunday, May 4, 2014 8:04 PM
Ten Centuries


Group: General Forum Members
Last Login: Today @ 4:28 AM
Points: 1,236, Visits: 3,591
Quick question, why do you think you need a Cursor?
Post #1567391
Posted Monday, May 5, 2014 12:34 AM


SSChampion


Group: General Forum Members
Last Login: Sunday, July 20, 2014 11:20 PM
Points: 13,252, Visits: 10,133
sqlservercentral.com 32358 (5/4/2014)
I'm currently re-writing an inherited C#/SQL CSV import process due to speed issues with the previous process (reading each line with a C# StreamReader).

The CSV has 6300 rows, and 21 columns (mostly strings, a couple of dates), which are used to update figures in our internal database.

I'm currently unsure of which of the below ideas would be more efficient:
- A Foreach loop on all 21 columns, passing the 21 parameters to an Execute SQL Task and using a Stored Procedure to update each row
- Modifying my Data Flow Control to dump the imported result set to an SQL table, and using an SQL Cursor to iterate through each row

I expect a cursor would be much more efficient than starting 6300 separate SSIS Execute SQL Tasks and passing all 21 parameters to each one for every iteration of the Foreach loop, but am unsure of this. Does anyone have any insight or guidance on which approach is more efficient? Any help would be greatly appreciated.


If you do go with SSIS, the approach you describe is the wrong one. Instead of a Foreach Loop and an Execute SQL Task, read the flat file with a data flow and use an OLE DB Command to do the update. However, that still processes the data row by row and fires off 6300 separate update statements.

I would just dump the data into a SQL Server table and write a decent query on top of it.




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #1567417
Posted Sunday, May 11, 2014 7:49 PM
Grasshopper


Group: General Forum Members
Last Login: Wednesday, July 2, 2014 9:01 PM
Points: 10, Visits: 103
Apologies for not responding to the answers here sooner. You've all convinced me that the row-based approach I was going for isn't best practice. I had chosen this over set based because there's a set of processes that need to be followed depending on the values in each row. I've begun re-writing this C#/SQL mix as a flow-chart, and will then try to re-write it using set based logic and BULK INSERT for the insert process.

Because this is so far from SSIS and the original post's topic, I'll start a new question in the correct forum if I can't work out how to complete the reimplementation in SQL. Thanks heaps for the concern about my overall approach; I'm trying to move from simply getting things done to using best practices, and this challenge should help with that.
Post #1569663
Posted Sunday, May 11, 2014 8:01 PM


SSC-Dedicated


Group: General Forum Members
Last Login: Yesterday @ 11:19 PM
Points: 36,724, Visits: 31,173
sqlservercentral.com 32358 (5/11/2014)
I had chosen this over set based because there's a set of processes that need to be followed depending on the values in each row.


Stop thinking like that. Start thinking about the set of processes that each column needs, instead. Like the tagline in my signature line below says...

"Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
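To make that concrete, value-dependent "processes" can usually be folded into a CASE expression so a whole column is handled in one statement. This is a hypothetical sketch (invented table names, status values, and columns), purely to show the shape:

```sql
-- One pass over the column; the per-row branching lives inside CASE.
UPDATE tgt
SET    tgt.Figure = CASE src.Action
                        WHEN 'ADJUST'  THEN tgt.Figure + src.Delta
                        WHEN 'REPLACE' THEN src.Figure
                        ELSE tgt.Figure
                    END
FROM   dbo.InternalFigures AS tgt
INNER JOIN dbo.CsvStaging  AS src
        ON src.KeyColumn = tgt.KeyColumn;
```

Each branch of the original flow-chart becomes a WHEN clause (or a separate filtered UPDATE), rather than a decision made once per row in a loop.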




--Jeff Moden
"RBAR" is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1569666