Simple Method for Importing Ragged files

  • I'm not sure that works because it's not a fixed width file, it's a chunk of text that may not have any delimiters in it at all and BCP still needs to know the ROW delimiter? If I'm wrong please do correct me, it would be very helpful.

    I guess an easier way to ask this question is how would you take a flat file that has these three lines of text below:

    TEST1 ABC

    TEST2

    TEST3 DEFG

    and insert them into a table using SSIS with the ROW delimiter being every 3rd character so that your rows in your table would be:

    TES

    T1

    ABC

    TES

    T2

    TES

    T3

    DEF

    G

  • Hi johnr,

    The sample you posted appears to have row delimiters in it. If there are no row delimiters at all, and the file looks like this:

    TEST1ABCTEST2DEFTEST3 TEST4GHI

    Then you'll probably have some problems. If the file looks like this:

    TEST1ABC

    TEST2DEF

    TEST3

    TEST4GHI

    And there actually are row delimiters, then the BCP format file trick should work. What you seem to be indicating though, is that you want to essentially rotate the columns and rows. I believe there's an SSIS transformation to do something similar, although if not you can bulk load into a temp table and perform a few INSERT...SELECT FROM... statements to move the data from the temp table into your permanent table pretty easily.

  • yeah i realized after i posted this that i included row delimiters in the example but there really aren't in the file i'm talking about. it's to the same end though.

    i'm after an ssis method to import the file into a table based on a column delimiter of every (n) characters.

    appreciate the feedback.

    thanks so much,

    -j

  • You could load the entire file into a single VARCHAR(MAX) column and use SUBSTRING to split it up on the server or you may be able to find some functionality in SSIS that would essentially substring the data out at fixed-width positions... I'm just not 100% on what the best SSIS functionality to do that would be.

  • Hello,

    Would it be possible to email me the pacakge created in this example? I would like to use it as a guide since I am new to developing SSIS packages.

    Email: tcgeeks@live.com

    Thanks,

    Assudad

  • We can all jump through flaming hoops to accomplish our ETL. However, there are limits on what we can or should accept as input. If the data format can't be parsed with consistent business rules that can be coded in T-SQL or C# or VB.Net, then we really can't process the file. Columns have to either be fixed-length or have a well defined delimiter. Same with row termination. It needs to be something that we can parse with the tools at hand.

    We have powerful tools, bcp, bulk insert, SSIS and others, but they can't guess at what the data is supposed to be. They have to be told what the data is supposed to be, and then when it isn't we must handle the exceptions.

    IMHO, It is acceptable to go back to the people who own the source data and say to them that a data format is unacceptable for specific reasons. Otherwise we need to hire a roomful of 10-key data entry clerks. Remember those days?

    Brandon_Forest@sbcglobal.net

  • I get these files which are basically text streams and they are only delimited by record.

    I've written a utlity to handle it now (it's slow and difficult to manage on all my remote sites) but is there anything already in SSIS that would just delimit the row at a specified length?

    If you just need to break at a specified length, I would look at something like sed to add a row delimiter:

    sed s/.../(&)"\ n"/g

    (remove the space in \ n)

    should add a CRLF after every 3 characters.

    Dump the output into a new file and process it using bcp.

    --

    JimFive

    --Edit to fix filtered text.

Viewing 7 posts - 46 through 51 (of 51 total)

You must be logged in to reply to this topic. Login to reply