• sqlservercentral.com 32358 (5/4/2014)


    I'm currently re-writing an inherited C#/SQL CSV import process due to speed issues with the previous process (reading each line with a C# StreamReader).

    The CSV has 6300 rows, and 21 columns (mostly strings, a couple of dates), which are used to update figures in our internal database.

    I'm currently unsure of which of the below ideas would be more efficient:

    - A Foreach loop on all 21 columns, passing the 21 parameters to an Execute SQL Task and using to a Stored Procedure to update each row

    - Modifying my Data Flow Control to dump the imported result set to an SQL table, and using an SQL Cursor to iterate through each row

    I expect a cursor would be much more efficient than starting 6300 SSIS seperate Execute SQL tasks and passing all 21 parameters to it for each iteration of the foreach loop, but am unsure of this. Does anyone have any insight or guidance over which approach is more efficient? Any help would be greatly appreciated.

    I'd have to say that either method is going to be relatively slow. Without know exactly how the data you import from the CSV will be used, I'll have to generalize a bit.

    First and possibly not related to anything except for my general dislike (ok, it's a hatred :-)) for DTS, SSIS, Busiess Objects, Web Methods, and a shedload of other similar applications, I probably wouldn't go near SSIS with this task. There... like the little girl in the 6th sense said, "I feel much better now". 😛

    If it were me, I'd Bulk Load (using either Bulk Insert or BCP depending on where the file was stored) the data into a staging table and evaluate/validate the data in each COLUMN (possibly, all columns at once). I wouldn't loop through each row.

    After that, I'd write either a good ol' fashioned "upsert" (2005 and below) or use MERGE (2008 and above) to use the new data to update existing rows and to insert rows that don't already exist. At worst, it would be and Update followed by an Insert... not a process that did one row at a time... even if I were forced to do it in SSIS :-D.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)