Error when loading CSV file to SQL Server using SSIS. The Column Delimiter was not found

  • Can someone help me with loading the attached csv file database.
    The issue here is the CSV file has data with Double quotes and comma seperated values but there are some instances where one of the column has Doule quotes in the data also.(Show with Bold)

    ---FYI.. This is a single record in my CSV File....
    "33043000030000","24","MAGNOLIA PETROLEUM CO.","NORTH DAKOTA STATE A  1","NORTH DAKOTA STATE A","1","MAGNOLIA PETROLEUM CO.","NORTH DAKOTA STATE "A" #1","10/21/1950","5609","KIDDER","141 N","73 W","36","SENE","1980 FNL 810 FEL","WILDCAT","","","","VERTICAL","46.990158000000001","-99.864013","OG","DRY","","12/24/1950"

    I read online and people are suggesting me to write VB code.
    Can somone help me on this.
    Thank you.

  • kashyap4007 - Tuesday, August 28, 2018 6:11 PM

    Can someone help me with loading the attached csv file database.
    The issue here is the CSV file has data with Double quotes and comma seperated values but there are some instances where one of the column has Doule quotes in the data also.(Show with Bold)

    ---FYI.. This is a single record in my CSV File....
    "33043000030000","24","MAGNOLIA PETROLEUM CO.","NORTH DAKOTA STATE A  1","NORTH DAKOTA STATE A","1","MAGNOLIA PETROLEUM CO.","NORTH DAKOTA STATE "A" #1","10/21/1950","5609","KIDDER","141 N","73 W","36","SENE","1980 FNL 810 FEL","WILDCAT","","","","VERTICAL","46.990158000000001","-99.864013","OG","DRY","","12/24/1950"

    I read online and people are suggesting me to write VB code.
    Can somone help me on this.
    Thank you.

    Whoever provided the data didn't follow the rules for CSV and double up on the quotes that were embedded in a field.  And people suggesting the VB route apparently don't actually have a clue either because VB will split the line as if it had 2 extra fields, as well.

    Are all the lines screwed up like this?  If they're consistent, then no need for VB or anything else crazy.  But ALL the lines need to have the same very consistent mistake and then we can salvage the data pretty easily. 

    So is the mistake consistent on every line or not?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you Guru for the quick reply.
    I am attaching the sample data here.
    To answer your question, this is not consistent.

  • You've still not attached any data but... I'm not sure that you should.  Is there any private or sensitive information in the file?  And "guru" is just the level of my participation, not my name or handle.  My name is Jeff Moden... Look above people's avatars for their names.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,
    This is public data which i got from website so I think i can share.
    Below is 2 more records.

    3.31E+1341AMERADA HESS CORPORATIONKERMIT HALVERSON     1KERMIT HALVERSON1AMERADA PETROLEUM CORP.KERMIT HALVERSON #111/30/19519705MCKENZIE152 N96 W26NESE1830 FSL  660 FELCLEAR CREEKVERTICAL47.953663-102.921996OGDRY4/22/1952
    3.31E+1342AMERADA HESS CORPORATIONBEAVER LODGE-MADISON UNIT     V-26BEAVER LODGE-MADISON UNITV-26AMERADA PETROLEUM CORP.N. DAKOTA STATE A" #1"12/11/19518595WILLIAMS156 N95 W16NESE1980 FSL  660 FELBEAVER LODGEMADISON911269|459128|002/11/1952|521|18|641VERTICAL48.332938-102.895553OGPA8/3/1984
  • The file has been built incorrectly.
    No parsing tool can fix it.
    Must be corrected at the source.

    "NORTH DAKOTA STATE A"",""1",
    "NORTH DAKOTA STATE ""A"" #1",
    and so on.

    _____________
    Code for TallyGenerator

  • kashyap4007 - Tuesday, August 28, 2018 8:37 PM

    Jeff,
    This is public data which i got from website so I think i can share.
    Below is 2 more records.

    3.31E+1341AMERADA HESS CORPORATIONKERMIT HALVERSON     1KERMIT HALVERSON1AMERADA PETROLEUM CORP.KERMIT HALVERSON #111/30/19519705MCKENZIE152 N96 W26NESE1830 FSL  660 FELCLEAR CREEKVERTICAL47.953663-102.921996OGDRY4/22/1952
    3.31E+1342AMERADA HESS CORPORATIONBEAVER LODGE-MADISON UNIT     V-26BEAVER LODGE-MADISON UNITV-26AMERADA PETROLEUM CORP.N. DAKOTA STATE A" #1"12/11/19518595WILLIAMS156 N95 W16NESE1980 FSL  660 FELBEAVER LODGEMADISON911269|459128|002/11/1952|521|18|641VERTICAL48.332938-102.895553OGPA8/3/1984

    You seem to have left the delimiters out.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • kashyap4007 - Tuesday, August 28, 2018 8:37 PM

    Jeff,
    This is public data which i got from website so I think i can share.
    Below is 2 more records.

    3.31E+1341AMERADA HESS CORPORATIONKERMIT HALVERSON     1KERMIT HALVERSON1AMERADA PETROLEUM CORP.KERMIT HALVERSON #111/30/19519705MCKENZIE152 N96 W26NESE1830 FSL  660 FELCLEAR CREEKVERTICAL47.953663-102.921996OGDRY4/22/1952
    3.31E+1342AMERADA HESS CORPORATIONBEAVER LODGE-MADISON UNIT     V-26BEAVER LODGE-MADISON UNITV-26AMERADA PETROLEUM CORP.N. DAKOTA STATE A" #1"12/11/19518595WILLIAMS156 N95 W16NESE1980 FSL  660 FELBEAVER LODGEMADISON911269|459128|002/11/1952|521|18|641VERTICAL48.332938-102.895553OGPA8/3/1984

    Whoever was saving the data in the CSV file made a rookie mistake of doing
     + '"' + strValue + '"'
    instead of 
    + QUOTENAME(strValue, '"') 
    or, if strValue can be possibly longer than 128 characters,
    + '"' + REPLACE(strValue, '"', '""') + '"'

    Put these 2 rows into Excel and save the file as CSV.
    You'll see how it should be done correctly.

    _____________
    Code for TallyGenerator

  • Sergiy - Tuesday, August 28, 2018 8:44 PM

    The file has been built incorrectly.
    No parsing tool can fix it.
    Must be corrected at the source.

    "NORTH DAKOTA STATE A"",""1",
    "NORTH DAKOTA STATE ""A"" #1",
    and so on.

    If the data is inconsistent in it's mistake, then I absolutely agree.  This is a rookie mistake on the part of whomever assembles the data and it's easily fixed at the source.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Is this a one off, or will this be regularly scheduled load?
    Try specifying just the comma delimiter in SSIS, load into a staging table and write a stored procedure to clean it up and load to the target table.

  • Joe Torre - Wednesday, August 29, 2018 11:45 AM

    Is this a one off, or will this be regularly scheduled load?
    Try specifying just the comma delimiter in SSIS, load into a staging table and write a stored procedure to clean it up and load to the target table.

    Better yet, do a BULK INSERT and sequester the bad rows as they occur and fix them separately.  There are "switches" in BULK INSERT that will allow that.  I imagine that SSIS can also do similar.  That would make it so only the "problem" rows would need any form of rework.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you all for the help.
    This is how I was able to resolve the issue.
    I did set the TextQualified to False on the columns where the issue is happening and used a Derived Column Transformation to remove the surrounding """" data.

  • kashyap4007 - Wednesday, August 29, 2018 12:58 PM

    Thank you all for the help.
    This is how I was able to resolve the issue.
    I did set the TextQualified to False on the columns where the issue is happening and used a Derived Column Transformation to remove the surrounding """" data.

    The original bad row that you posted had a comma embedded in a "field", did it not?  If so, this doesn't fix that.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • There is comma also in few fiends Jeff but it did not cause me any issues.
    Below are some wierd data in some of the columns:
    NORTH DAKOTA STATE A" #1"
    DEVONIAN,MADISON,SILURIAN
    324488|33755|0,180041|70622|0,81960|76|0

  • kashyap4007 - Wednesday, August 29, 2018 1:59 PM

    There is comma also in few fiends Jeff but it did not cause me any issues.
    Below are some wierd data in some of the columns:
    NORTH DAKOTA STATE A" #1"
    DEVONIAN,MADISON,SILURIAN
    324488|33755|0,180041|70622|0,81960|76|0

    I guess I don't understand how those commas didn't cause any issues.  I thought the data was comma delimited.  Considering your recent posts, is the data actually tab delimited?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply