BCP in tab separated file to a table

  • Hello guys

    I am trying to bcp a log file into a table. Fields are "tab" separated and rows are "crlf" separated.

    The rows of the file are uneven, in the sense, one row might have 35 tabs and the next one might have 10.

    The table i am trying to insert to has 40 columns (maximum no.of columns i calculated from the log file).

    I tried both these commands

    BCP.EXE TEST.dbo.LOG IN E:\syslog.TXT -c -e error.txt -T -S

    BCP.EXE TEST.dbo.LOG IN E:\syslog.TXT -c -t\t -r\r -e error.txt -T -S

    (i also tried using a format file from table)

    when I run this, the table is loaded with data but not in the intended way.This is what I have from the table

    If the 1st line in the text file has 35 columns and the row ends after it, in the table the 1st row has correct info until the 35th column and instead of going to the next row for the next line in file, it continues to use the next 5 columns in table before it goes to the next row. I think its not getting the row delimiter.

    How do I force it to next line. Can you please help me figure this out.

    Thanks

  • this is what they call a "ragged right" file format, and bcp and bulk insert cannot handle that;

    see this post for an example of how to do it, but the built in Import Export wizard can handle it as well, i believe.

    there are a number of posts by Jeff Moden to do this via BCP; he refer's to the issue as "ragged right";

    here's one example he posted to address the issue...it's not SSIS, but it's worth looking at:

    [ur]http://www.sqlservercentral.com/Forums/FindPost1067869.aspx[/url]

    --Originally posted by Jeff Moden somewhere in the sands of time:

    BCP file import headless ragged right

    Heh... that's known as a "Headerless Ragged Right" file and they're a bugger to import correctly. BCP and BULK insert require that all the rows have the same number of delimiters so you can't do an import of such a file with either.

    However... with a little help from a couple of DOS commands and a little prestidigitation with a text based linked server, it can be done with very little pain.

    First, I assume you have a file, like in your original post, in C:\Temp called Test01.txt and it looks like this...

    col1,col2,col3

    col1,col2

    col1,col2,col3,col4

    col1,col2,col,3,col4,col5

    Next, let's setup a linked server and give it the necessary login privs...

    --===== Create a linked server to the drive and path you desire.

    EXEC dbo.sp_AddLinkedServer TxtSvr,

    'Jet 4.0',

    'Microsoft.Jet.OLEDB.4.0',

    'C:\Temp',

    NULL,

    'Text'

    --===== Set up login mappings.

    EXEC dbo.sp_AddLinkedSrvLogin TxtSvr, FALSE, NULL, Admin, NULL

    Here comes the fun part... if we just read the file directly...

    --===== Query one of the files by using a four-part name.

    SELECT *

    FROM TxtSvr...[Test01^#txt]

    ... we get an awful mess that looks like this...

    F4 F5 col1 col2 col3

    col1 col2

    col4 col1 col2 col3

    col4 col5 col1 col2 col3

    (3 row(s) affected)

    notice that the first row was used as a header and that the data has been rearranged AND that two of the headers have been made up! We can fix that with a couple of simple DOS commands...

    --===== Create a header that identifies the columns we want... (any col names will do)

    EXEC Master.dbo.xp_CmdShell 'ECHO HdrCol1,HdrCol2,HdrCol3,HdrCol4,HdrCol5 > C:\Temp\Header.txt'

    --===== Create a new working file that has the header and the original file as one

    EXEC Master.dbo.xp_CmdShell 'COPY C:\Temp\Header.txt+C:\Temp\Test01.txt C:\Temp\MyWork.txt'

    Then, accessing the data in the file is a breeze... do with it what you want because it can now be referenced as a table with the correct column names...

    --===== Read the csv text file as if it were a table

    SELECT *

    FROM TxtSvr...[MyWork^#txt]

    HdrCol1 HdrCol2 HdrCol3 HdrCol4 HdrCol5

    col1 col2 col3

    col1 col2

    col1 col2 col3 col4

    col1 col2 col3 col4 col5

    (4 row(s) affected)

    If you need to drop the linked server after than, the following command will do nicely...

    EXEC dbo.sp_DropServer 'TxtSvr', 'DropLogins'

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Thanks for the info.

    I will try this procedure to see if this will work for us

  • That's from a pretty old post of mine. Jet drivers are not available on 64 bit machines. You'll need the new ACE drivers from MS. It's pretty easy to find the download for that. Look for "ACE Driver Download Microsoft" in Google and then use ONLY the one from MS. No telling what other sites have embedded.

    If you still have difficulty, post back.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you. I will try that

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply