BULK INSERT eliminating quotes

  • Hi all, I'm attempting to import some Retrosheet baseball data into a MS SQL database. The main issue is that all of the character fields are double quoted ("text") ... which is not fun to handle. I'm using a format file to import it in - but I'm getting the following error:

    MSG 4828, Level 16, State 1, Line 2

    Cannot bulk load. Invalid destination table column number for source column 2 in the format file ...

    Any help would be greatly appreciated, Thanks! Here is the first line of the data my format file code:

    "ANA200904060","OAK",1,0,0,2,2,"CBFBS",0,0,"sweer001","L","sweer001","L","saunj001","L","saunj001","L","mathj001","morak001","kendh001","figgc001","aybae001","rivej001","huntt001","abreb001","","","","K","T","F",8,1,3,"T","T",0,"F","F",1,"F","F",0,"F","F",0,"","F","F","",0,0,"N",0,"N",0,"N",0,0,0,0,"2","","","","F","F","F","F","F","F","F","F","F","","","","T","F","F","F","F","","","","",0,2,0,0,0,0,0,0,0,1

    9.0

    98

    1 SQLCHAR 0 0 "\"" 0 dummy1 ""

    2 SQLCHAR 0 14 "\","\" 1 col1 ""

    3 SQLCHAR 0 5 "\"," 2 col2 ""

    4 SQLCHAR 0 0 "," 3 col3 ""

    5 SQLCHAR 0 0 "," 4 col4 ""

    6 SQLCHAR 0 0 "," 5 col5 ""

    7 SQLCHAR 0 0 "," 6 col6 ""

    8 SQLCHAR 0 0 ","\" 7 col7 ""

    9 SQLCHAR 0 50 "\"," 8 col8 ""

    10 SQLCHAR 0 0 "," 9 col9 ""

    11 SQLCHAR 0 0 ","\" 10 col10 ""

    12 SQLCHAR 0 10 "\","\" 11 col11 ""

    13 SQLCHAR 0 3 "\","\" 12 col12 ""

    14 SQLCHAR 0 10 "\","\" 13 col13 ""

    15 SQLCHAR 0 3 "\","\" 14 col14 ""

    16 SQLCHAR 0 10 "\","\" 15 col15 ""

    17 SQLCHAR 0 3 "\","\" 16 col16 ""

    18 SQLCHAR 0 10 "\","\" 17 col17 ""

    19 SQLCHAR 0 3 "\","\" 18 col18 ""

    20 SQLCHAR 0 10 "\","\" 19 col19 ""

    21 SQLCHAR 0 10 "\","\" 20 col20 ""

    22 SQLCHAR 0 10 "\","\" 21 col21 ""

    23 SQLCHAR 0 10 "\","\" 22 col22 ""

    24 SQLCHAR 0 10 "\","\" 23 col23 ""

    25 SQLCHAR 0 10 "\","\" 24 col24 ""

    26 SQLCHAR 0 10 "\","\" 25 col25 ""

    27 SQLCHAR 0 10 "\","\" 26 col26 ""

    28 SQLCHAR 0 10 "\","\" 27 col27 ""

    29 SQLCHAR 0 10 "\","\" 28 col28 ""

    30 SQLCHAR 0 10 "\","\" 29 col29 ""

    31 SQLCHAR 0 50 "\","\" 30 col30 ""

    32 SQLCHAR 0 3 "\","\" 31 col31 ""

    33 SQLCHAR 0 3 "\"," 32 col32 ""

    34 SQLCHAR 0 0 "," 33 col33 ""

    35 SQLCHAR 0 0 "," 34 col34 ""

    36 SQLCHAR 0 0 ","\" 35 col35 ""

    37 SQLCHAR 0 3 "\","\" 36 col36 ""

    38 SQLCHAR 0 3 "\"," 37 col37 ""

    39 SQLCHAR 0 0 ","\" 38 col38 ""

    40 SQLCHAR 0 3 "\","\" 39 col39 ""

    41 SQLCHAR 0 3 "\"," 40 col40 ""

    42 SQLCHAR 0 0 ","\" 41 col41 ""

    43 SQLCHAR 0 3 "\","\" 42 col42 ""

    44 SQLCHAR 0 3 "\"," 43 col43 ""

    45 SQLCHAR 0 0 ","\" 44 col44 ""

    46 SQLCHAR 0 3 "\","\" 45 col45 ""

    47 SQLCHAR 0 3 "\"," 46 col46 ""

    48 SQLCHAR 0 0 ","\" 47 col47 ""

    49 SQLCHAR 0 3 "\","\" 48 col48 ""

    50 SQLCHAR 0 3 "\","\" 49 col49 ""

    51 SQLCHAR 0 3 "\","\" 50 col50 ""

    52 SQLCHAR 0 20 "\"," 51 col51 ""

    53 SQLCHAR 0 0 "," 52 col52 ""

    54 SQLCHAR 0 0 ","\" 53 col53 ""

    55 SQLCHAR 0 3 "\"," 54 col54 ""

    56 SQLCHAR 0 0 ","\" 55 col55 ""

    57 SQLCHAR 0 3 "\"," 56 col56 ""

    58 SQLCHAR 0 0 ","\" 57 col57 ""

    59 SQLCHAR 0 3 "\"," 58 col58 ""

    60 SQLCHAR 0 0 "," 59 col59 ""

    61 SQLCHAR 0 0 "," 60 col60 ""

    62 SQLCHAR 0 0 "," 61 col61 ""

    63 SQLCHAR 0 0 ","\" 62 col62 ""

    64 SQLCHAR 0 10 "\","\" 63 col63 ""

    65 SQLCHAR 0 10 "\","\" 64 col64 ""

    66 SQLCHAR 0 10 "\","\" 65 col65 ""

    67 SQLCHAR 0 10 "\","\" 66 col66 ""

    68 SQLCHAR 0 3 "\","\" 67 col67 ""

    69 SQLCHAR 0 3 "\","\" 68 col68 ""

    70 SQLCHAR 0 3 "\","\" 69 col69 ""

    71 SQLCHAR 0 3 "\","\" 70 col70 ""

    72 SQLCHAR 0 3 "\","\" 71 col71 ""

    73 SQLCHAR 0 3 "\","\" 72 col72 ""

    74 SQLCHAR 0 3 "\","\" 73 col73 ""

    75 SQLCHAR 0 3 "\","\" 74 col74 ""

    76 SQLCHAR 0 3 "\","\" 75 col75 ""

    77 SQLCHAR 0 10 "\","\" 76 col76 ""

    78 SQLCHAR 0 10 "\","\" 77 col77 ""

    79 SQLCHAR 0 10 "\","\" 78 col78 ""

    80 SQLCHAR 0 3 "\","\" 79 col79 ""

    81 SQLCHAR 0 3 "\","\" 80 col80 ""

    82 SQLCHAR 0 3 "\","\" 81 col81 ""

    83 SQLCHAR 0 3 "\","\" 82 col82 ""

    84 SQLCHAR 0 3 "\","\" 83 col83 ""

    85 SQLCHAR 0 10 "\","\" 84 col84 ""

    86 SQLCHAR 0 10 "\","\" 85 col85 ""

    87 SQLCHAR 0 10 "\","\" 86 col86 ""

    88 SQLCHAR 0 10 "\"," 87 col87 ""

    89 SQLCHAR 0 0 "," 88 col88 ""

    90 SQLCHAR 0 0 "," 89 col89 ""

    91 SQLCHAR 0 0 "," 90 col90 ""

    92 SQLCHAR 0 0 "," 91 col91 ""

    93 SQLCHAR 0 0 "," 92 col92 ""

    94 SQLCHAR 0 0 "," 93 col93 ""

    95 SQLCHAR 0 0 "," 94 col94 ""

    96 SQLCHAR 0 0 "," 95 col95 ""

    97 SQLCHAR 0 0 "," 96 col96 ""

    98 SQLCHAR 0 0 "\r" 97 col97 ""

  • Does the file you're trying to import have a header? If so, it must have exactly the same delimiters even to skip it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • No it doesn't ... just the data. The field names are provided in another file on their website. However, it seems after some more laborious searching and messing around I got all of it to work ... the key was adding a prefix value of 1 to the dummy column and the "\",\"" separator. Once I figured that out I could then determine what separator to use for the rest of the file:

    9.0

    98

    1 SQLCHAR 0 1 "" 0 dummy1 ""

    2 SQLCHAR 0 14 "\",\"" 1 col1 ""

    3 SQLCHAR 0 5 "\"," 2 col2 ""

    4 SQLCHAR 0 0 "," 3 col3 ""

    5 SQLCHAR 0 0 "," 4 col4 ""

    ...

    What a pain ... I wish MS SQL had an OPTIONALLY ENCLOSED statement like MySQL does ... would make things much easier. Thanks!

  • I agree... it's a huge pain. I think they made it that way to get folks to use SSIS. 😉

    As a side bar, there is another way but I don't believe that it will work on the 64 bit version

    of SQL Server because it uses OPENROWSET...

    /**********************************************************************************************************************

    Purpose:

    Script to read from a TSV (Tab Separated Values) table having an unknown number of columns of unknown data type.

    This demo is setup to read a given file called Artists_mini.txt located in C:\Temp. I'm saving the "any file"

    version for an article ;-)

    Author: Jeff Moden - 28 Aug 2008

    **********************************************************************************************************************/

    --===== Declare the local variables we'll need. Names are self-documenting

    DECLARE @RootKey SYSNAME,

    @Key SYSNAME,

    @Value_Name SYSNAME,

    @Type SYSNAME,

    @PreviousValue SYSNAME,

    @NewValue SYSNAME

    --===== Preset the "constants". These are self documenting as well

    SELECT @RootKey = N'HKEY_LOCAL_MACHINE',

    @Key = N'SOFTWARE\Microsoft\Jet\4.0\Engines\Text',

    @Value_Name = N'Format',

    @Type = N'REG_SZ',

    @NewValue = N'TabDelimited' --May be a character using N'Delimited(,)' where the comma is the character

    --Original setting is usually N'CSVDelimited' which handles quote text qualifiers too.

    --===== Remember the previous value so we can set it back

    EXEC Master.dbo.xp_Instance_RegRead

    @RootKey = @RootKey,

    @Key = @Key,

    @Value_Name = @Value_Name,

    @Value = @PreviousValue OUTPUT

    --===== Show what the original delimeter setting was set to.

    -- This, of course, may be commented out

    SELECT 'Previous delimiter setting = ' + @PreviousValue

    --===== All set... define the new temporary delimiter

    EXEC Master.dbo.xp_Instance_RegWrite

    @RootKey = @RootKey,

    @Key = @Key,

    @Value_Name = @Value_Name,

    @Type = @Type,

    @Value = @NewValue

    --===== Read the TAB delimited file without knowing any of the columns

    -- Notes: "Database =" identifies the directory the file is in

    -- The FROM clause identifies the file name

    SELECT *

    FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',

    'Text;Database=C:\Temp;HDR=YES;FMT=Delimited',

    'SELECT * FROM Artists_mini.txt')

    --===== Show the content of the table we just populated

    SELECT * FROM #MyHEad

    --===== Restore the original delimiter setting

    EXEC Master.dbo.xp_Instance_RegWrite

    @RootKey = @RootKey,

    @Key = @Key,

    @Value_Name = @Value_Name,

    @Type = @Type,

    @Value = @PreviousValue

    --===== Cleanup after the demo

    DROP TABLE #MyHead

    Be sure to read the comment about "Original setting is usually N'CSVDelimited' which handles quote text qualifiers too."

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Almost forgot... I believe that setting up a "Text" linked server may do the

    trick, as well. However, both OPENROWSET and a Text Linked Server will be

    a bit slower than either BCP or BULK INSERT with that nice format file you made.

    A huge side benefit to having such a format file is that it'll catch a lot of errors

    if errors occur. Other methods will sometimes still give you the bad data. Neither

    OPENROWSET or the Text Linked Server will give you the extra utility of either

    BCP or BULK INSERT. Things like sequestering bad rows for later analysis while

    allowing the good rows to still come in are quite valueable to me.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply