BULK INSERT seems not to recognise line terminators

  • Dear Experts

    I am trying to get BULK IMPORT to work on a simple table, as follows:

    CREATE TABLE [dbo].[ebsx03](

    [varchar](8) NOT NULL,

    [MEANING] [varchar](50) NOT NULL,

    [DESCRIPTION] [varchar](255) NULL,

    [TYPE] [varchar](255) NULL,

    [USAGE] [varchar](255) NULL,

    PRIMARY KEY CLUSTERED

    ( ASC)

    I have a csv file containing a mere two lines, as follows, each terminated by carriage return and line feed in that order:

    "10000000","BOOKED","Booked","ADDITIONAL_REQUIREMENT_STATUS","ADDITIONALREQUIREMENTSTATUS"

    "10000001","REQUESTED","Requested","ADDITIONAL_REQUIREMENT_STATUS","ADDITIONALREQUIREMENTSTATUS"

    I am trying to import by using the following T-SQL:

    BULK INSERT dbo.ebsx03

    FROM '\\csu-000-sp01\Temp\ebsx03June_2015.csv'

    WITH (DATAFILETYPE='char',

    FIELDTERMINATOR=',',

    ROWTERMINATOR='\r',

    ERRORFILE='\\csu-000-sp01\Temp\BadLines.csv')

    Please note that though the ROWTERMINATOR line looks like a backslash r in my preview/browser, it is actually a backslash r backslash n in the copied text!!!

    When I do this, I get 1, yes just one, line inserted. The CODE, MEANING, DESCRIPTION and TYPE fields have their proper contents -- except that they also have the surrounding quotes. The final field on the line (USAGE), gets the contents of last field in the first line PLUS everything in the second line, including commas, joined into a single string, so, except that the field is folded by the formatting in the forum post. I have pasted it here:

    "ADDITIONALREQUIREMENTSTATUS"

    "10000001","REQUESTED","Requested","ADDITIONAL_REQUIREMENT_STATUS","ADDITIONALREQUIREMENTSTATUS"

    So, why is it seemingly missing the line terminator? I am also surprised that the quotes aren't being stripped. How hard can it be?

    Yours, praying for enlightenment.

    Mark Dalley

  • Remove the "backslash r" and keep only "backslash n".

    EDIT: To remove quotes, you need a format file. Check this thread for an example: http://www.sqlservercentral.com/Forums/FindPost1777839.aspx

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Man, it worked. I could have sworn I'd already tried that.

    A slight twist though, and an extra confusing factor - the original file was actually Unix format, with only line-feed-terminated lines (I used an editor to covert it to Windows format). Can I make the original file work?

    MarkD

  • Mark Dalley (4/15/2016)


    Man, it worked. I could have sworn I'd already tried that.

    A slight twist though, and an extra confusing factor - the original file was actually Unix format, with only line-feed-terminated lines (I used an editor to covert it to Windows format). Can I make the original file work?

    MarkD

    You could, but I haven't worked with a Unix format, so I can't tell you what configuration you need.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Hi Luis

    I am going to read and inwardly digest Jeff Moden's Spackle article[/url] that you is linked to in the thread you pointed to, and see if that suggests anything.

    Many thanks for the help. I definitely feel less at sea now.

    MarkD

  • Mark Dalley (4/15/2016)


    Man, it worked. I could have sworn I'd already tried that.

    A slight twist though, and an extra confusing factor - the original file was actually Unix format, with only line-feed-terminated lines (I used an editor to covert it to Windows format). Can I make the original file work?

    MarkD

    Yes. You the \ n terminator (without the space I used here to get it to show up).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Luis Cazares (4/15/2016)


    Mark Dalley (4/15/2016)


    Man, it worked. I could have sworn I'd already tried that.

    A slight twist though, and an extra confusing factor - the original file was actually Unix format, with only line-feed-terminated lines (I used an editor to covert it to Windows format). Can I make the original file work?

    MarkD

    You could, but I haven't worked with a Unix format, so I can't tell you what configuration you need.

    They're the same except they use only \ n.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff and Luis

    OK, used a format file as suggested in Jeff's article, works smoothly. Since some of the fields contain embedded commas, I have changed the real delimiters to vertical bars. I have also removed the quotes. FWIW, format file is as follows:

    8.0

    5

    1SQLCHAR08000"|"1CODE ""

    2SQLCHAR08000"|"2MEANING ""

    3SQLCHAR08000"|"3DESCRIPTION ""

    4SQLCHAR08000"|"4TYPE ""

    5SQLCHAR08000""5USAGE ""

    -------------------------------------------- ---------

    FileDataPrefixDataEnd-OfTableTable Collation

    OrderTypeLengthMaxlenFieldCol #Col Name Name

    Delim

    Note: Record terminator is newline only (Unix style)

    Two more questions:

    1. Can I easily modify the format file to strip surrounding quotes so they don't wind up in the imported data?

    2. Can I handle embedded commas within quote-enclosed fields the way I ought to be able to (i.e the way Excel does? (suspect answer is no, but hope springs eternal)?

    Gotta go, back tomorrow UK time but thanks for help so far.

    Mark Dalley

  • Just for the record, why doesn't the backslash n after the fifth field in the format file show up, (unless one sticks a space between the backslash and the n, which I forgot to do)? I find it inconvenient.

    MarkD

  • I recently had a similar situation, and found the import-csv powershell cmdlet mixed with out-datatable/write-datatable cmdlets seemed to work the smoothest. Also easily handled parsing an array in one of the fields.

  • Mark Dalley (4/18/2016)


    Just for the record, why doesn't the backslash n after the fifth field in the format file show up, (unless one sticks a space between the backslash and the n, which I forgot to do)? I find it inconvenient.

    MarkD

    It's because of the forum software. A bit inconvenient but not terribly so once you're aware of the problem when you post.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • benjamin.reyes (4/18/2016)


    I recently had a similar situation, and found the import-csv powershell cmdlet mixed with out-datatable/write-datatable cmdlets seemed to work the smoothest. Also easily handled parsing an array in one of the fields.

    Hi Benjamin

    Thanks for mentioning this... Powershell is quite high on my list of desirable skills. I will tackle the stairway for it when I get a favourable moment. Parsing arrays into fields, eh... I love stuff you can use to bend things around your little finger, so to speak.

    Having finally got bulk insert to work, the adjectives that come to mind are quirky and brittle, but jolly fast. No, I can't handle embedded commas like Excel. Removing surrounding quotes seems possible by being clever about end-of-field terminators, the seeming exception being the leading quote on the first field. Rather than struggle with it I have just used a text editor to remove all quotes altogether.

    However, if anyone does know how one could remove the leading quote on the first field, I'd be very interested to hear.

    MarkD

  • Mark Dalley (4/18/2016)


    Hi Jeff and Luis

    OK, used a format file as suggested in Jeff's article, works smoothly. Since some of the fields contain embedded commas, I have changed the real delimiters to vertical bars. I have also removed the quotes. FWIW, format file is as follows:

    8.0

    5

    1SQLCHAR08000"|"1CODE ""

    2SQLCHAR08000"|"2MEANING ""

    3SQLCHAR08000"|"3DESCRIPTION ""

    4SQLCHAR08000"|"4TYPE ""

    5SQLCHAR08000""5USAGE ""

    -------------------------------------------- ---------

    FileDataPrefixDataEnd-OfTableTable Collation

    OrderTypeLengthMaxlenFieldCol #Col Name Name

    Delim

    Note: Record terminator is newline only (Unix style)

    Two more questions:

    1. Can I easily modify the format file to strip surrounding quotes so they don't wind up in the imported data?

    2. Can I handle embedded commas within quote-enclosed fields the way I ought to be able to (i.e the way Excel does? (suspect answer is no, but hope springs eternal)?

    Gotta go, back tomorrow UK time but thanks for help so far.

    Mark Dalley

    The leading quote is easy to handle, but might not be easy to find out without guidance.

    This example uses commas as field terminators and quotes on all the columns (and an extra space that needs to be removed). You can change it accordingly.

    8.0

    6

    1SQLCHAR08000"\""0DUMMY ""

    2SQLCHAR08000"\",\""1CODE ""

    3SQLCHAR08000"\",\""2MEANING ""

    4SQLCHAR08000"\",\""3DESCRIPTION ""

    5SQLCHAR08000"\",\""4TYPE ""

    6SQLCHAR08000"\"\ n"5USAGE ""

    -------------------------------------------- ---------

    FileDataPrefixDataEnd-OfTableTable Collation

    OrderTypeLengthMaxlenFieldCol #Col Name Name

    Delim

    Note: Record terminator is newline only (Unix style)

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • 8.0

    6

    1SQLCHAR08000"\""0DUMMY ""

    2SQLCHAR08000"\",\""1CODE ""

    3SQLCHAR08000"\",\""2MEANING ""

    4SQLCHAR08000"\",\""3DESCRIPTION ""

    5SQLCHAR08000"\",\""4TYPE ""

    6SQLCHAR08000"\"\ n"5USAGE ""

    -------------------------------------------- ---------

    FileDataPrefixDataEnd-OfTableTable Collation

    OrderTypeLengthMaxlenFieldCol #Col Name Name

    Delim

    Note: Record terminator is newline only (Unix style)

    OK, so embedded double quotes are expressed as \". And remove the space between the backslash and the n on field line 6.

    Ahhhh... a dummy "field" before the first real one, just to handle the leading quote on the first field... Now that is just sneakily awesome.:-P

    MarkD

  • Mark Dalley (4/19/2016)


    8.0

    6

    1SQLCHAR08000"\""0DUMMY ""

    2SQLCHAR08000"\",\""1CODE ""

    3SQLCHAR08000"\",\""2MEANING ""

    4SQLCHAR08000"\",\""3DESCRIPTION ""

    5SQLCHAR08000"\",\""4TYPE ""

    6SQLCHAR08000"\"\ n"5USAGE ""

    -------------------------------------------- ---------

    FileDataPrefixDataEnd-OfTableTable Collation

    OrderTypeLengthMaxlenFieldCol #Col Name Name

    Delim

    Note: Record terminator is newline only (Unix style)

    OK, so embedded double quotes are expressed as \". And remove the space between the backslash and the n on field line 6.

    Ahhhh... a dummy "field" before the first real one, just to handle the leading quote on the first field... Now that is just sneakily awesome.:-P

    MarkD

    Exactly! You've got it.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply