Fixed Length Flat File

  • Hi, I have a flat file with over 200 fixed length columns. I want to use SSIS to load the data into SQL Server.

    Is there a way to easily load them into SQL Server without parsing each column?

    I will appreciate any help. Thanks,

  • I am afraid there is no "silver bullet" solution.

    You can use a Data Flow task, with several flows if data goes into several tables, or a Script Task. In a Script Task you could define a dictionary of offsets and lengths by table and column name and do the parsing in one simple loop, which IMO would be easier to maintain than 200+ individual substrings.

    Sorry if my answer is too general. I could perhaps give a more specific opinion if I had more detail.

  • Than you for the suggestion.

  • Not sure I understand the problem?? Why not use the flat file connection, set it to fixed length for the format and load up the data?

    Steve.

  • Yes, that would be a data flow. However, you need one dataflow per target table, and for 200+ that would be difficult (although not impossible) to maintain.

  • However, you need one dataflow per target table, and for 200+ that would be difficult (although not impossible) to maintain.

    Maybe I missed it, but I didn't see the requirement to load each column to an individual table. If you have (or want to create) a target table with the ~ 200 columns, that's easy enough to do. You can then set up the flat file connection as fixed width, and work with SSIS on it's guesses at the widths (ie you'll end up having to make changes where the guesses are wrong). Once you have this, you have a single data flow that will load all required fields into the target table.

    Or, with more detail, posted by 'wcm' on 12-13-2010

    1. Add a Data Flow Task to the Control Flow.

    a)dbl-Click to enter the Data Flow Tab

    2. Add a Flat File Source to the Data Flow Tab

    a)Identify the Flat File Source in the Flat File Source Editor Dialog

    3. Now That you have the Connection manager for the for the flat file created:

    a)edit the Flat File Conn Manager

    b)View the Advance Properties and you will see "Data Type" property and it's default is

    String [DT_STR] and "OutputColumnWidth" is set to 50.

    c) To change one column you can highlight the column in the middle list box, and make the change

    d) you can also do multiplt selects to do several columns at once.

    I'm still not seeing the issue?? Or the concern about 'parsing every field'.

    Steve.

  • You do nolt have to load each column into a different table; however, if you are loading columns into more than one table (which is likely the case), you would create an SSIS Data Flow task that would be, in my experience, difficult to maintain.

    Been there, done that. Thanks, I do not need another t-shirt.

  • josetur12 (3/11/2011)


    Hi, I have a flat file with over 200 fixed length columns. I want to use SSIS to load the data into SQL Server.

    Is there a way to easily load them into SQL Server without parsing each column?

    I will appreciate any help. Thanks,

    You could use T-SQL. Start with a table that contains each column (order, name, starting position, size). Then, utilizing BULK INSERT, load the data into a #temp table. Run a select to get the data.

    This is a start... just expand upon it.

    DECLARE @Columns TABLE (ColumnOrder INT, ColumnName varchar(20), StartingPosition smallint, ColumnSize tinyint);

    INSERT INTO @Columns

    SELECT 1, 'Column1', 1, 10 UNION ALL

    SELECT 2, 'Column2', 11, 5 UNION ALL

    SELECT 3, 'Column3', 16, 12;

    DECLARE @TestData TABLE (RowID INT IDENTITY, RowData varchar(100));

    INSERT INTO @TestData

    SELECT 'y4gfbnenbeofnoihfgjq3fgwqgnq3ovqvqno' UNION ALL

    SELECT 'qgbughqofniufgonqgoinvoqfqnbqnqn3qfo';

    WITH cte AS

    (

    SELECT t.RowID, dt.*

    FROM @TestData t

    CROSS APPLY (SELECT ColumnOrder,

    ColumnName,

    ColumnValue = SubString(t.RowData, StartingPosition, ColumnSize)

    FROM @Columns) dt

    )

    SELECT RowID,

    Column1 = MAX(CASE WHEN ColumnName = 'Column1' THEN ColumnValue ELSE NULL END),

    Column2 = MAX(CASE WHEN ColumnName = 'Column2' THEN ColumnValue ELSE NULL END),

    Column3 = MAX(CASE WHEN ColumnName = 'Column3' THEN ColumnValue ELSE NULL END)

    FROM CTE

    GROUP BY RowID

    ORDER BY RowID;

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • #Columns is what I meant by "dictionary of columns". I csan only guess but I think that there will be some parsing involved.

  • josetur12 (3/11/2011)


    I have a flat file with over 200 fixed length columns

    We're just guess here because we can't see your file.

    Do you have a record layout for this file and can you attach a text file with the first 10 rows? Of course, if you do attach a file, make sure it's not violating any privacy or company private concerns.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you all again. WayneS' solution works.

  • josetur12 (3/13/2011)


    Thank you all again. WayneS' solution works.

    If you have the data to exercise Wayne's good solution, you might want to look into what a BCP format file does for performance. It contains mostly the same kind of data.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/13/2011)


    josetur12 (3/13/2011)


    Thank you all again. WayneS' solution works.

    If you have the data to exercise Wayne's good solution, you might want to look into what a BCP format file does for performance. It contains mostly the same kind of data.

    Dang it, you did it again. 😀 I was waiting to see if he tried this, and was then going to suggest the same.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • BWAA-HAA!!! Great minds DO think alike! Sorry to steal your thunder, Wayne. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • NP 😎

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply