SSIS question on CSV Import

  • Many people may have faced this problem and I searched many solutions but somehow I could not hit the right one. Sorry for repeatative question.

    Here is my data in csv file and I am trying to import into sql server table but it does not import the way it should.

    field1,field2,field3,field4,field5,field6

    n1,n2,n3,stringWithout Quotes,"String in Quotes with ""words"" and coma , in the string",some more

    n1,n2,n3,"stringWith Quotes","String in Quotes with coma , in thestring",some more

    I need output

    Record 1:

    Field1: n1

    Field2: n2

    Field3: n3

    Field4: stringWithout Quotes

    Field5: String in Quotes with "words" and coma , in the string

    Field6: some more

    Record 2:

    Field1: n1

    Field2: n2

    Field3: n3

    Field4: stringWith Quotes

    Field5: String in Quotes with coma , in thestring

    Field6: some more

    I tried to cleanup with Regex in script task ,mentioned in some other blog but it did not work and I am not expert to change RegEx to get desired output. Can anyone please help to find solution? Regex should work but any other solution is also welcome.

    Thanks.

  • import of data from csv works based on delimeter or fixed length.

    Adress the delimeter before you import.

    Next define and format the data as needed.

  • mak101 (3/29/2013)


    Many people may have faced this problem and I searched many solutions but somehow I could not hit the right one. Sorry for repeatative question.

    Here is my data in csv file and I am trying to import into sql server table but it does not import the way it should.

    field1,field2,field3,field4,field5,field6

    n1,n2,n3,stringWithout Quotes,"String in Quotes with ""words"" and coma , in the string",some more

    n1,n2,n3,"stringWith Quotes","String in Quotes with coma , in thestring",some more

    I need output

    Record 1:

    Field1: n1

    Field2: n2

    Field3: n3

    Field4: stringWithout Quotes

    Field5: String in Quotes with "words" and coma , in the string

    Field6: some more

    Record 2:

    Field1: n1

    Field2: n2

    Field3: n3

    Field4: stringWith Quotes

    Field5: String in Quotes with coma , in thestring

    Field6: some more

    I tried to cleanup with Regex in script task ,mentioned in some other blog but it did not work and I am not expert to change RegEx to get desired output. Can anyone please help to find solution? Regex should work but any other solution is also welcome.

    Thanks.

    I'm certainly no SSIS guru but my understanding is that there's a "text delimiter" entry someone in the import task and if you set that to the " character, most of your problems will be solved. The catch is going to be that you might need to have whomever is providing the data to double up on the embedded double quotes.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You could try manipulating the file first before you import it.

    Replace two times double quotes - "" - with another character, for example |.

    Then you import the file as usual, indicating double quote " as text delimiter.

    Once imported in SQL Server, replace | with a single double quote.

    Or choose a decent delimiter 😉

    edit: your CSV is invalid by the way. Either text is quoted or it isn't. The CSV file from the example mixes those two.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • I tried all above replies before posted this quetion. And no my csv format is not incorrect. If you cut and paste three lines in notepad and save the file as csv and open it you will see correct distribution. My problem is coma within double quotes, coma breakes that value in two different fields and moves field positions off from original fields.

    In SSIS I am trying to cleanup the file first with replacing double quotes with |~| with regex but I guess my regex is not working as I wanted. I tried the regex from some other blog but still missing something. So I am looking here regex help or any other option to import into correct format. Here is my regex copied from other blog:

    Regex.Replace(

    allText,

    @"""((?:""""|.)*?)""(?!"")",

    @"|~|$1|~|"

    ).Replace("\"\"", "\"")

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply