BULK INSERT with format file problems

  • Hi all

    Could anyone shed any light on the problem I have using BULK INSERT please? It's not something I have ever really needed to use in anger. This is proof of concept at the moment, there are elements outside my control/not set in stone yet.

    Long story short, I have a need to load into a staging table a (fairly small - dozens of rows at a time) csv file, that contains the results of a web form enquiry - externally hosted, delivery method TBC.

    I have been testing with an anonymised copy (attached) contents below.

    your-name,your-email,your-telephone,your-message,g-recaptcha-response,created_date
    name1,email1,0700123456,Message1,captcha1,22/05/2017 00:45
    Name2,email2,0700246864,Message2,captcha2,05/06/2017 11:21

    The staging table is:

    CREATE TABLE WebEnquiry (
        weID INT IDENTITY PRIMARY KEY NOT NULL
    ,    weFullName NVARCHAR(150) NOT NULL
    ,    weEmail NVARCHAR(150) NOT NULL
    ,    weTelephone NVARCHAR(25) NOT NULL
    ,    weMessage NVARCHAR(4000) NOT NULL
    ,    weCreated DATETIME NOT NULL
    );

    I wanted to ignore the captcha column, so used Jeff Moden's excellent article as reference and created the format file attached and below:
    12.0
    6
    1    SQLNCHAR    2        300     ","         2    weFullname     Latin1_General_CI_AS
    2    SQLNCHAR    2        300     ","         3    weEmail        Latin1_General_CI_AS
    3    SQLNCHAR    2        50      ","         4    weTelephone    Latin1_General_CI_AS
    4    SQLNCHAR    2        8000    ","         5    weMessage      Latin1_General_CI_AS
    5    SQLNCHAR    2        8000    ","         0    notused        Latin1_General_CI_AS
    6    SQLNCHAR    2        24      "\r\n"      6    weCreated      ""
    ----- ------- ------ ------ ------------ ----- -------- ---------
    File Data Prefix Data End Of Field Table Table Collation
    Order Type Length Length Delimiter Col # Col Name Name

    I use the following statement to insert:


    BULK INSERT dbo.WebEnquiry
    FROM 'L:\BCPDrop\test.csv'
    WITH (
         BATCHSIZE        = 1000
    ,    CODEPAGE         = 'RAW'
    ,    DATAFILETYPE     = 'char'
    ,    FIRSTROW         = 2
    ,    FORMATFILE       = 'L:\BCPFormatFiles\test.fmt'
    ,    MAXERRORS        = 2000000000
    ,    ERRORFILE        = 'L:\BCPDrop\test.err'
    ,    TABLOCK

    );

    And get the following errors:

    Msg 4866, Level 16, State 7, Line 10
    The bulk load failed. The column is too long in the data file for row 1, column 1. Verify that the field terminator and row terminator are specified correctly.
    Msg 7301, Level 16, State 2, Line 10
    Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".

    I had hoped the process of writing this out would have made me realise the problem, that's worked before, but it hasn't this time.

    Googling, I have seen a few reports of the second error, msg 7301, and that suggests replacing \r\n with 0x0a, which hasn't helped. I cannot see why I am getting the length problem. I even tried creating a test file from scratch in Notepad and Notepad++ in case I was getting any odd artifacts/characters in case the supplied sample had been opened and saved in Excel.

    Using SQL Server 2014 Standard SP2 on Windows 2012 R2.

    Any help gratefully received.

    "Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information upon it. When we enquire into any subject, the first thing we have to do is to know what books have treated of it. This leads us to look at catalogues, and at the backs of books in libraries."
    — Samuel Johnson
    I wonder, would the great Samuel Johnson have replaced that with "GIYF" now?

  • Quick suggestion, change the "\r\n" to "\n"
    😎

  • Eirikur Eiriksson - Tuesday, July 11, 2017 7:08 AM

    Quick suggestion, change the "\r\n" to "\n"
    😎

    Nope, sorry. Same message 🙁 I had previously tried \r\r\n and quoting all columns in the test.csv.

    Part of me is quite glad I haven't missed something immediately obvious and simple 🙂

    "Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information upon it. When we enquire into any subject, the first thing we have to do is to know what books have treated of it. This leads us to look at catalogues, and at the backs of books in libraries."
    — Samuel Johnson
    I wonder, would the great Samuel Johnson have replaced that with "GIYF" now?

  • david.edwards 76768 - Tuesday, July 11, 2017 7:41 AM

    Eirikur Eiriksson - Tuesday, July 11, 2017 7:08 AM

    Quick suggestion, change the "\r\n" to "\n"
    😎

    Nope, sorry. Same message 🙁 I had previously tried \r\r\n and quoting all columns in the test.csv.

    Part of me is quite glad I haven't missed something immediately obvious and simple 🙂

    Thought it was worth the try.
    😎
    Quick question, why doesn't the staging table match the column definitions in the format file?

  • Eirikur Eiriksson - Tuesday, July 11, 2017 8:04 AM

    Thought it was worth the try.
    😎
    Quick question, why doesn't the staging table match the column definitions in the format file?

    I'll try anything at this point 🙂

    Table columns 2-6 match on description, doubled length to allow for SQLNCHAR  and used SQLNCHAR in format file for column 6 based on the notes in Jeff's article 

    The "Data Type" column is used to identify the data type of the field you're importing. You could specifiy the datatype of each filed but I think that's overkill because the staging table should be setup with the correct datatypes.  I only use SQLCHAR or SQLNCHAR for that reason.  It helps keep things simple.  (See the "References" section at the end of this article for more information)

    "Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information upon it. When we enquire into any subject, the first thing we have to do is to know what books have treated of it. This leads us to look at catalogues, and at the backs of books in libraries."
    — Samuel Johnson
    I wonder, would the great Samuel Johnson have replaced that with "GIYF" now?

  • Did a quick test and it works fine by changing the format file and the staging table
    😎

    Format file
    12.0
    6
    1  SQLCHAR  0   300  ","   2  weFullname  "Latin1_General_CI_AS"
    2  SQLCHAR  0   300  ","   3  weEmail   "Latin1_General_CI_AS"
    3  SQLCHAR  0   50  ","   4  weTelephone  "Latin1_General_CI_AS"
    4  SQLCHAR  0   4000  ","   5  weMessage  "Latin1_General_CI_AS"
    5  SQLCHAR  0   4000  ","   0  notused   "Latin1_General_CI_AS"
    6  SQLCHAR  0   16  "\r\n"  6  weCreated  "Latin1_General_CI_AS"
    ----- ------- ------ ------ ------------ ----- -------- ---------
    File Data Prefix Data End Of Field Table Table Collation
    Order Type Length Length Delimiter Col # Col Name Name

    Staging table
    CREATE TABLE dbo.Import_WebEnquiry (
      weID    INT IDENTITY PRIMARY KEY NOT NULL
    ,  weFullName  NVARCHAR(150) NOT NULL
    ,  weEmail   NVARCHAR(150) NOT NULL
    ,  weTelephone  NVARCHAR(25) NOT NULL
    ,  weMessage  NVARCHAR(4000) NOT NULL
    ,  weCreated  NVARCHAR(25) NOT NULL
    );

  • Thanks Eirikur

    I had tried using SQLCHAR 0... instead of SQLNCHAR 2...  previously, I didn't get errors but did get 0 rows. ...Thinking back I might have been using a test file with no header and forgotten to change the FIRSTROW back to 1!

    It's been one of those weeks where every single time I get a train of though going I have been distracted by something or someone else needing all my concentration.

    I had just started going down a different tack in desperation. I added the "not needed" column to the staging table, dropped the surrogate key and used a natural composite key of Email and Created I could BULK INSERT without using a format file absolutely fine.

    None of this is helped by the fact that I am still not entirely sure what sort of file I am going to get for real.

    Thanks for all your time and help.

    David

    "Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information upon it. When we enquire into any subject, the first thing we have to do is to know what books have treated of it. This leads us to look at catalogues, and at the backs of books in libraries."
    — Samuel Johnson
    I wonder, would the great Samuel Johnson have replaced that with "GIYF" now?

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply