Stripping out double quotes (") in bulk insert

  • Hi mate,

    Is there a way you can unzip/extract files in a directory via T-SQL? by that I mean without using SSIS or DTS packages.

  • Hi,

    I use 7-Zip's command line interface through xp_cmdshell to do this for a number of processes here.

    HTH,



    Ade

    A Freudian Slip is when you say one thing and mean your mother.
    For detail-enriched answers, ask detail-enriched questions...[/url]

  • I bought the pro copy of WINZIP... it comes with the command line version. It's only $39 US (maybe less now). No, it's not free... but it is supported and it will survive and SPA audit. Shareware won't and, a lot of times, you get what you pay for with "Free Ware".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hey everyone! I'm having a similar issue and I was wondering if you all could help. I'm trying to perform a bulk insert on MS-SQL 2005 from a text file that is updated on a weekly basis. If for some reason any of this is not easily readable, I'll be more that happy to attach text files.

    Here is the table that I'm trying to bulk insert into:

    CREATE TABLE [dbo].[DVD_Import](

    [DVD_Title] [varchar](128) NULL,

    [Studio] [varchar](30) NULL,

    [Released] [datetime] NULL,

    [Status] [varchar](15) NULL,

    [Sound] [varchar](20) NULL,

    [Versions] [varchar](20) NULL,

    [Price] [money] NULL,

    [Rating] [varchar](5) NULL,

    [Year] [varchar](5) NULL,

    [Genre] [varchar](20) NULL,

    [Aspect] [varchar](6) NULL,

    [UPC] [varchar](15) NULL,

    [DVD_ReleaseDate] [datetime] NULL,

    [ID] [int] IDENTITY(1,1) NOT NULL,

    [Timestamp] [datetime] NULL,

    [Updated] [smallint] NULL

    And here are a two sample rows of data and the header of the text file I want to bulk insert from:

    "DVD_Title","Studio","Released","Status","Sound","Versions","Price","Rating","Year","Genre","Aspect","UPC","DVD_ReleaseDate","ID","Timestamp","Updated"

    "60 Minutes: Story 1: March 16, 2008","CBS",,"Out","2.0","4:3",17.95,"NR","2008","Documentary","1.33:1","883629509075",2008-04-01 00:00:00,134122,2008-03-16 00:00:00,1

    "National Geographic: DogTown - Second Chances","National Geographic Video",2008-07-01 00:00:00,"Pending","2.0","4:3",19.98,"NR","UNK","Special Interest","1.33:1","727994753094",2008-07-01 00:00:00,134898,2008-04-02 00:00:00,0

    Using this bcp command:

    bcp DVDLib.dbo.DVD_Import format nul -n -U -P -f dvdimport.fmt

    I created a format file and modified it like so:

    9.0

    17

    1SQLCHAR 0 0 "\"" 0 Dummy1 ""

    2 SQLCHAR 0 128 "\",\"" 1 DVD_Title SQL_Latin1_General_CP1_CI_AS

    3 SQLCHAR 0 30 "\"," 2 Studio SQL_Latin1_General_CP1_CI_AS

    4 SQLDATETIME 0 8 ",\"" 3 Released ""

    5 SQLCHAR 0 15 "\",\"" 4 Status SQL_Latin1_General_CP1_CI_AS

    6 SQLCHAR 0 20 "\",\"" 5 Sound SQL_Latin1_General_CP1_CI_AS

    7 SQLCHAR 0 20 "\"," 6 Versions SQL_Latin1_General_CP1_CI_AS

    8 SQLMONEY 0 8 ",\"" 7 Price ""

    9 SQLCHAR 0 5 "\",\"" 8 Rating SQL_Latin1_General_CP1_CI_AS

    10 SQLCHAR 0 5 "\",\"" 9 Year SQL_Latin1_General_CP1_CI_AS

    11 SQLCHAR 0 20 "\",\"" 10 Genre SQL_Latin1_General_CP1_CI_AS

    12 SQLCHAR 0 6 "\",\""11 Aspect SQL_Latin1_General_CP1_CI_AS

    13 SQLCHAR 0 15 "\"," 12 UPC SQL_Latin1_General_CP1_CI_AS

    14 SQLDATETIME 0 8 "," 13 DVD_ReleaseDate ""

    15 SQLINT 0 4 "," 14 ID ""

    16 SQLDATETIME 0 8 "," 15 Timestamp ""

    17 SQLSMALLINT 0 2 "\r" 16 Updated ""

    And finally:

    BULK INSERT DVD_Import

    FROM 'C:\DVDLibew_csv.txt'

    WITH

    (

    FORMATFILE = 'C:\DVDLib\dvdimport.fmt',

    FIRSTROW = 2,

    DATAFILETYPE = 'native'

    )

    However, it isn't working at all. My goal is to have a C# console app that calls a stored procedure to perform a weekly bulk insert. I'm just before writing a procedure to dump all the data (quotes and all) into a temp table and strip out all the quotes before inserting the records into the destination table.

    I would really appreciate any insight and help you can provide...I'm getting really tired of looking at format files.

    Thanks,

    Joshua

  • You say it isn't working at all... do you get any kind of an error message?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sorry about that! Here are the error messages I receive:

    Msg 4866, Level 16, State 7, Line 1

    The bulk load failed. The column is too long in the data file for row 1, column 2. Verify that the field terminator and row terminator are specified correctly.

    Msg 7301, Level 16, State 2, Line 1

    Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".

  • I haven't checked the entire BCP format file, but the delimiters for the first 4 columns are good...

    The problem is with the 3rd column of the HEADER (not the data rows). One of the very unfortunate parts about BCP is the the delimiters in the header, even if skipped, must be identical to the delimiters in the format file. Your header, 3rd column in highlighted below...

    "DVD_Title","Studio","Released","Status","Sound","Versions","Price","Rating","Year","Genre","Aspect","UPC","DVD_ReleaseDate","ID","Timestamp","Updated"

    Here's column 3 in the data (which happens to match the format file)...

    "National Geographic: DogTown - Second Chances","National Geographic Video",2008-07-01 00:00:00,"Pending","2.0","4:3",19.98,"NR","UNK","Special Interest","1.33:1","727994753094",2008-07-01 00:00:00,134898,2008-04-02 00:00:00,0

    See any difference in delimiters there? You may have other columns in the same fix so you need to make sure that delimiters in the header match the delimiters in the data match the delimiters in the BCP format file.

    Also, you are not using the "Prefix" column in the format file, not should you. What that means is that you need to change the datatype in the format file for ALL columns to SQLCHAR.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • OK, so if I'm understanding you correctly, the column headers need to match the delimiters as the data rows in the CSV file?

    Based on that, I changed the header row as follows:

    "DVD_Title","Studio",Released,"Status","Sound","Versions",Price,"Rating","Year","Genre","Aspect","UPC",DVD_ReleaseDate,ID,Timestamp,Updated

    I also reworked the format file to reflect the change to the header row and set the column prefixes to 0. When I first made the change, I forgot to update the format file to account for reading the header row. So I added a few dummy columns to process the beginning double quotes

    9.0

    26

    1SQLCHAR 2 0 "\"" 0 Dummy1 ""

    2 SQLCHAR 2 128 ""\," 1 DVD_Title SQL_Latin1_General_CP1_CI_AS

    3SQLCHAR 2 0 "\"" 0 Dummy2 ""

    4 SQLCHAR 2 30 ""\," 2 Studio SQL_Latin1_General_CP1_CI_AS

    5 SQLDATETIME 1 8 "," 3 Released ""

    6SQLCHAR 2 0 "\"" 0 Dummy3 ""

    7 SQLCHAR 2 15 ""\," 4 Status SQL_Latin1_General_CP1_CI_AS

    8SQLCHAR 2 0 "\"" 0 Dummy4 ""

    9 SQLCHAR 2 20 ""\," 5 Sound SQL_Latin1_General_CP1_CI_AS

    10SQLCHAR 2 0 "\"" 0 Dummy5 ""

    11 SQLCHAR 2 20 ""\," 6 Versions SQL_Latin1_General_CP1_CI_AS

    12 SQLMONEY 2 8 "," 7 Price ""

    13SQLCHAR 2 0 "\"" 0 Dummy6 ""

    14 SQLCHAR 2 5 ""\," 8 Rating SQL_Latin1_General_CP1_CI_AS

    15SQLCHAR 2 0 "\"" 0 Dummy7 ""

    16 SQLCHAR 2 5 ""\," 9 Year SQL_Latin1_General_CP1_CI_AS

    17SQLCHAR 2 0 "\"" 0 Dummy8 ""

    18 SQLCHAR 2 20 ""\," 10 Genre SQL_Latin1_General_CP1_CI_AS

    19SQLCHAR 2 0 "\"" 0 Dummy9 ""

    20 SQLCHAR 2 6 ""\," 11 Aspect SQL_Latin1_General_CP1_CI_AS

    21SQLCHAR 2 0 "\"" 0 Dummy10 ""

    22 SQLCHAR 2 15 ""\," 12 UPC SQL_Latin1_General_CP1_CI_AS

    23 SQLDATETIME 1 8 "," 13 DVD_ReleaseDate ""

    24 SQLINT 0 4 "," 14 ID ""

    25 SQLDATETIME 1 8 "," 15 Timestamp ""

    26 SQLSMALLINT 1 2 "\r" 16 Updated ""

    Based on that, it should process the header (and subsequent rows) as follows:

    1: "

    2: DVD_Title",

    3: "

    4: Studio",

    5: Released,

    6: "

    7: Status",

    8: "

    9: Sound",

    10: "

    11: Versions",

    12: Price,

    13: "

    14: Rating",

    15: "

    16: Year",

    17: "

    18: Genre",

    19: "

    20: Aspect",

    21: "

    22: UPC",

    23: DVD_ReleaseDate,

    24: ID,

    25: Timestamp,

    26: Updated

    However, when I try to perform the bulk insert, I get the following error message:

    Msg 4828, Level 16, State 1, Line 1

    Cannot bulk load. Invalid destination table column number for source column 2 in the format file "C:\DVDLib\dvdimport.fmt".

    Thanks again for the help!

  • jastarling (4/4/2008)


    OK, so if I'm understanding you correctly, the column headers need to match the delimiters as the data rows in the CSV file?

    Based on that, I changed the header row as follows:

    "DVD_Title","Studio",Released,"Status","Sound","Versions",Price,"Rating","Year","Genre","Aspect","UPC",DVD_ReleaseDate,ID,Timestamp,Updated

    I also reworked the format file to reflect the change to the header row and set the column prefixes to 0. When I first made the change, I forgot to update the format file to account for reading the header row. So I added a few dummy columns to process the beginning double quotes

    9.0

    26

    1SQLCHAR 2 0 "\"" 0 Dummy1 ""

    2 SQLCHAR 2 128 ""\," 1 DVD_Title SQL_Latin1_General_CP1_CI_AS

    3SQLCHAR 2 0 "\"" 0 Dummy2 ""

    4 SQLCHAR 2 30 ""\," 2 Studio SQL_Latin1_General_CP1_CI_AS

    5 SQLDATETIME 1 8 "," 3 Released ""

    6SQLCHAR 2 0 "\"" 0 Dummy3 ""

    7 SQLCHAR 2 15 ""\," 4 Status SQL_Latin1_General_CP1_CI_AS

    8SQLCHAR 2 0 "\"" 0 Dummy4 ""

    9 SQLCHAR 2 20 ""\," 5 Sound SQL_Latin1_General_CP1_CI_AS

    10SQLCHAR 2 0 "\"" 0 Dummy5 ""

    11 SQLCHAR 2 20 ""\," 6 Versions SQL_Latin1_General_CP1_CI_AS

    12 SQLMONEY 2 8 "," 7 Price ""

    13SQLCHAR 2 0 "\"" 0 Dummy6 ""

    14 SQLCHAR 2 5 ""\," 8 Rating SQL_Latin1_General_CP1_CI_AS

    15SQLCHAR 2 0 "\"" 0 Dummy7 ""

    16 SQLCHAR 2 5 ""\," 9 Year SQL_Latin1_General_CP1_CI_AS

    17SQLCHAR 2 0 "\"" 0 Dummy8 ""

    18 SQLCHAR 2 20 ""\," 10 Genre SQL_Latin1_General_CP1_CI_AS

    19SQLCHAR 2 0 "\"" 0 Dummy9 ""

    20 SQLCHAR 2 6 ""\," 11 Aspect SQL_Latin1_General_CP1_CI_AS

    21SQLCHAR 2 0 "\"" 0 Dummy10 ""

    22 SQLCHAR 2 15 ""\," 12 UPC SQL_Latin1_General_CP1_CI_AS

    23 SQLDATETIME 1 8 "," 13 DVD_ReleaseDate ""

    24 SQLINT 0 4 "," 14 ID ""

    25 SQLDATETIME 1 8 "," 15 Timestamp ""

    26 SQLSMALLINT 1 2 "\r" 16 Updated ""

    Based on that, it should process the header (and subsequent rows) as follows:

    1: "

    2: DVD_Title",

    3: "

    4: Studio",

    5: Released,

    6: "

    7: Status",

    8: "

    9: Sound",

    10: "

    11: Versions",

    12: Price,

    13: "

    14: Rating",

    15: "

    16: Year",

    17: "

    18: Genre",

    19: "

    20: Aspect",

    21: "

    22: UPC",

    23: DVD_ReleaseDate,

    24: ID,

    25: Timestamp,

    26: Updated

    However, when I try to perform the bulk insert, I get the following error message:

    Msg 4828, Level 16, State 1, Line 1

    Cannot bulk load. Invalid destination table column number for source column 2 in the format file "C:\DVDLib\dvdimport.fmt".

    Thanks again for the help!

    Heh... you're fighting me... You don't need to add the dummy columns... and you didn't change everything to SQLCHAR like I told you. You also didn't change PREFIXES to 0 like I told you. The PREFIX column in the column right after the SQLCHAR stuff.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • OK, sorry about that. I don't think I had my morning coffee yet when I read your message. So now I did the following:

    1.) Updated the header row (same as yesterday):

    "DVD_Title","Studio",Released,"Status","Sound","Versions",Price,"Rating","Year","Genre","Aspect","UPC",DVD_ReleaseDate,ID,Timestamp,Updated

    2.) Updated the format file:

    9.0

    17

    1SQLCHAR 0 0 "\"" 0 Dummy1 ""

    2 SQLCHAR 0 128 "\",\"" 1 DVD_Title SQL_Latin1_General_CP1_CI_AS

    3 SQLCHAR 0 30 "\"," 2 Studio SQL_Latin1_General_CP1_CI_AS

    4 SQLCHAR 0 8 ",\"" 3 Released ""

    5 SQLCHAR 0 15 "\",\"" 4 Status SQL_Latin1_General_CP1_CI_AS

    6 SQLCHAR 0 20 "\",\"" 5 Sound SQL_Latin1_General_CP1_CI_AS

    7 SQLCHAR 0 20 "\"," 6 Versions SQL_Latin1_General_CP1_CI_AS

    8 SQLCHAR 0 8 ",\"" 7 Price ""

    9 SQLCHAR 0 5 "\",\"" 8 Rating SQL_Latin1_General_CP1_CI_AS

    10 SQLCHAR 0 5 "\",\"" 9 Year SQL_Latin1_General_CP1_CI_AS

    11 SQLCHAR 0 20 "\",\"" 10 Genre SQL_Latin1_General_CP1_CI_AS

    12 SQLCHAR 0 6 "\",\""11 Aspect SQL_Latin1_General_CP1_CI_AS

    13 SQLCHAR 0 15 "\"," 12 UPC SQL_Latin1_General_CP1_CI_AS

    14 SQLCHAR 0 8 "," 13 DVD_ReleaseDate ""

    15 SQLCHAR 0 4 "," 14 ID ""

    16 SQLCHAR 0 8 "," 15 Timestamp ""

    17 SQLCHAR 0 2 "\r" 16 Updated ""

    And it works!! Thanks so much for your help!!

  • Outstanding! Thank you for the feedback!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jonathan Dabbs - Thursday, May 4, 2006 4:17 AM

    We have a client that's still on SQL Server 6.5 (yes, I know, I struggled to remember that far back too!), and they asked me to do some work on importing a text file into a table.  Never used DTS or bulk import in the past, but didn't think this was going to be too much of an issue.  however, I'm struggling to get the data in using bulk insert without the double quotes that are included in the file (the quotes are needed as some columns have several commas within the same field, and should all go into the same column).

    The format of the file is csv, and here's a sample line.

    "Fred", "Bloggs", "123 Any Street, Any town, Any county, AB1 2CD", "Teacher".

    When I use bulk insert, it imports it with the quotes still in the file. i.e.

    select * from table;

    Forename    Surname     Address                                                               Occupation

    "Fred"         "Bloggs"      "123 Any Street, Any town, Any county, AB1 2CD" "Teacher"

    I'm obviously doing something very simple wrong, but can someone point out what this is, so I can import into the table without these quotes?

    Unfortunately, I can't amend the source file, as it's supplied externally.

    Many thanks

    Jonathan

  • Jonathan Dabbs - Thursday, May 4, 2006 4:17 AM

    We have a client that's still on SQL Server 6.5 (yes, I know, I struggled to remember that far back too!), and they asked me to do some work on importing a text file into a table.  Never used DTS or bulk import in the past, but didn't think this was going to be too much of an issue.  however, I'm struggling to get the data in using bulk insert without the double quotes that are included in the file (the quotes are needed as some columns have several commas within the same field, and should all go into the same column).

    The format of the file is csv, and here's a sample line.

    "Fred", "Bloggs", "123 Any Street, Any town, Any county, AB1 2CD", "Teacher".

    When I use bulk insert, it imports it with the quotes still in the file. i.e.

    select * from table;

    Forename    Surname     Address                                                               Occupation

    "Fred"         "Bloggs"      "123 Any Street, Any town, Any county, AB1 2CD" "Teacher"

    I'm obviously doing something very simple wrong, but can someone point out what this is, so I can import into the table without these quotes?

    Unfortunately, I can't amend the source file, as it's supplied externally.

    Many thanks

    Jonathan

    HI Jonathan,

    Could you please tell me how did you import the same data to SQL Server.
    "Fred", "Bloggs", "123 Any Street, Any town, Any county, AB1 2CD", "Teacher"

    I imported the data but due to ("123 Any Street, Any town, Any county, AB1 2CD") this row the comma(,) between the string is not able to import the data to sql server. how can i achieve. Please let me know.

  • vadrasrinivas44 - Monday, February 19, 2018 12:30 AM

    HI Jonathan,

    Could you please tell me how did you import the same data to SQL Server.
    "Fred", "Bloggs", "123 Any Street, Any town, Any county, AB1 2CD", "Teacher"

    I imported the data but due to ("123 Any Street, Any town, Any county, AB1 2CD") this row the comma(,) between the string is not able to import the data to sql server. how can i achieve. Please let me know.

    You just revived a thread that it's over 10 years old. I suggest that you read the following article about using using format files with bulk insert.
    Skipping Columns (& other tricks) Using BCP Format Files (SQL Spackle) - SQLServerCentral
    For your problem, you just need to add the quotes as part of the delimiter of the column. There is an example in this thread.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2

Viewing 14 posts - 46 through 58 (of 58 total)

You must be logged in to reply to this topic. Login to reply