Skipping Columns (& other tricks) Using BCP Format Files (SQL Spackle)

  • Comments posted to this topic are about the item Skipping Columns (& other tricks) Using BCP Format Files (SQL Spackle)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Always funny to see what you categorize as a "short" article Jeff 😀

    It's a great one though and I'm sure I'll refer to it soon in the future when I or someone else is dealing with flat files.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Koen Verbeeck (1/23/2014)


    Always funny to see what you categorize as a "short" article Jeff 😀

    It's a great one though and I'm sure I'll refer to it soon in the future when I or someone else is dealing with flat files.

    BWAAAA-HAAA!!!! I guess after some of the 26 page marathon articles I've written, my measuring stick is all messed up. Thanks for the feedback, Koen. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I love any piece that ends with "crack filled".

    Good job on this one. It's to the point and gives good practical advice.

    I notice though that with everyone who talks about this topic, nobody ever

    says what the space between the cols is in the format file.

    Is it a tab, a couple spaces, can it be mixed, etc?

    So in your example would it be:

    1<space><space>SQLCHAR

    1<tab>SQLCHAR

    1<tab><space>SQLCHAR

    Or can it just be any whitespace we want...

    Watch my free SQL Server Tutorials at:
    http://MidnightDBA.com
    Blog Author of:
    DBA Rant – http://www.MidnightDBA.com/DBARant

    Minion Maintenance is FREE:

  • KenpoDBA (1/23/2014)


    I love any piece that ends with "crack filled".

    Good job on this one. It's to the point and gives good practical advice.

    I notice though that with everyone who talks about this topic, nobody ever

    says what the space between the cols is in the format file.

    Is it a tab, a couple spaces, can it be mixed, etc?

    So in your example would it be:

    1<space><space>SQLCHAR

    1<tab>SQLCHAR

    1<tab><space>SQLCHAR

    Or can it just be any whitespace we want...

    Good point. I've never tried tabs (I avoid tabs even in code because of the different tab sizes of whatever reader someone is using) but I would imagine that tabs would work as the correct "white space". I'll try it after work and see.

    Thanks for the feedback.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • And here I was, working out how to skip a column because I've been sent a csv file with 16 columns intended for a table that has 15...

    Thank you for a well-explained article.

  • Thanks, I have almost never used this tool so your article was particularly informative.

  • Thanks for the article Jeff. It's always good to have a nice explained reference.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thanks folks. Heh... I aim to please... I sometimes miss but I'm always aiming. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Great article! Exellent choice on where to draw the line on its scope.

    Also, sometimes it's just as valuable to learn what can't be done as it is to learn what can be done because then you don't waste time trying to figure out how to do the impossible. So, thank you for confirming right away that BULK INSERT can't import spreadsheet files and for the reminder later on that BOL says using BULK INSERT to import CSV files is not supported.

    Jeff Moden - Article


    The real key is to get whomever or whatever is providing the data to provide it in a consistant manner just as you would do in a properly designed table.

    So true, but easier said than done! Sometimes you gotta take what you get. I've recently had to strip text qualifier quotes from TSVs prior to BULK INSERTING and wondered if I was missing out on some new command line or format file option that was available since the last time I read any documentation. Sadly, I see the answer is "No".

    I'm only going to mention one tiny correction in case you did it on purpose to see if anyone is paying attention:

    ]Jeff Moden - Article


    For example, the first "format line" is supposed to be for "ColC" of the table but I've used "Doesn't matter" here because it truly doesn't matter what's in this column.

    The posted example has "ColC", not "Doesn't matter". (BTW, I did not know this value was arbitrary, so thank you for pointing that out in your article.)

  • Apart from the fact that I needed this 4 years ago, excellent article.

  • Andy DBA (1/23/2014)


    Great article! Exellent choice on where to draw the line on its scope.

    Also, sometimes it's just as valuable to learn what can't be done as it is to learn what can be done because then you don't waste time trying to figure out how to do the impossible. So, thank you for confirming right away that BULK INSERT can't import spreadsheet files and for the reminder later on that BOL says using BULK INSERT to import CSV files is not supported.

    Jeff Moden - Article


    The real key is to get whomever or whatever is providing the data to provide it in a consistant manner just as you would do in a properly designed table.

    So true, but easier said than done! Sometimes you gotta take what you get. I've recently had to strip text qualifier quotes from TSVs prior to BULK INSERTING and wondered if I was missing out on some new command line or format file option that was available since the last time I read any documentation. Sadly, I see the answer is "No".

    I'm only going to mention one tiny correction in case you did it on purpose to see if anyone is paying attention:

    ]Jeff Moden - Article


    For example, the first "format line" is supposed to be for "ColC" of the table but I've used "Doesn't matter" here because it truly doesn't matter what's in this column.

    The posted example has "ColC", not "Doesn't matter". (BTW, I did not know this value was arbitrary, so thank you for pointing that out in your article.)

    If the quotes are consistent in the column, then you are, in fact, missing out on a format file option that MS didn't document because they "don't support CSVs". \" is what you use as part of the FIELDTEMINATOR column to represent double quotes. Importing such CSVs using BCP and BULK INSERT will be the subject of another article.

    Apologies for the error on the "Doesn't Matter" column notation. I had second thoughts about that and thought I had caught all of the places in the article that were based on that. Apparently, I missed one. I'll submit a correction for that. Thanks for bringing it to my attention.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/23/2014)


    If the quotes are consistent in the column, then you are, in fact, missing out on a format file option that MS didn't document because they "don't support CSVs". \" is what you use as part of the FIELDTEMINATOR column to represent double quotes. Importing such CSVs using BCP and BULK INSERT will be the subject of another article.

    Of course! Yes, I do remember coming across the approach where double quotes are included as part of the field terminator. There's also a way to "eat" the leading double quote from the first column, but I'll wait for your next article on the topic. I was able to take an approach where I didn't need format files, but I guess what I was really hoping for is a Magically-treat-double quotes-like-Excel-does command line option!:-D

  • I'm using SQL2012 and following your steps as carefully as possible.

    I have copy and paste your example code and data.

    When I run the BULK INSERT in Studio the result is error 'Cannot bulk load. Invalid column number in the format file...'

    I have googled for more info but no-one seems to have an answer that makes sense to me.

    Suggestions?

  • Ignore my last.

    When I copied the BULK INSERT statement I missed the last semi-colon. The error message was misleading.

    Yes Jeff. Works like a charm and thanks for posting it.

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply