How to import data from a CSV file?

  • I have a vendor-supplied CSV file in the format:

    "Field1","Field2","Field3"

    As you can see, the fields are enclosed in double quotes and separated by commas.

    Some fields have commas within the data, for example, the value for Field2 could be "A,B,C"

    I tried to import data using BULK INSERT. When I specify the FIELDTERMINATOR as ',', bad things happen, the value in Field2 is broken on each comma, the enclosing double quotes are inserted as part of the value. When I specify the FIELDTERMINATOR as '","' , everything looks Ok, but the first field has a leading and the last field has a trailing double quote.

    Is there a way I could make BULK INSERT import the data from my CSV file properly?

  • Yes there is... you will need to build a BCP Format File, but it's worth it for the sheer speed of Bulk Insert...

    The format file would look something like this...

    8.0

    4

    1 SQLCHAR 0 1 "" 0 LeadingQuoteDiscarded ""

    2 SQLCHAR 0 500 "\",\"" 1 Field1 ""

    3 SQLCHAR 0 500 "\",\"" 2 Field2 ""

    4 SQLCHAR 0 500 "\"\r" 3 Field3 ""

    Note that \" is how you mark a quote in the delimiter column. It's not explicitly documented but it's worked that way forever.

    See "BCP Format File" in Books Online for a lot more information on the power of this tool.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    Thank you! That's an awesome hack you did with the leading and trailing quotes.

  • Thanks for the feedback, Sergey!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    Sorry to bother you. You are the format file guru, could you please take a look at the new problem I have with BULK INSERT?

    http://www.sqlservercentral.com/Forums/Topic622222-145-1.aspx

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply