Data in the same order in which it is inserted

  • Hi,

    I am inserting data into a table from a flat file. I want the data in the table to be in the same order as in the flat file. Is there any way I can achieve this?

    Thanks!!

  • add an incrementing identity column to the destination table before running inserts?

  • pietlinden (3/29/2012)


    add an incrementing identity column to the destination table before running inserts?

    And then when you select from the table be sure to use an ORDER BY clause specifying that identity column.
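
    Something like this, as a minimal sketch - the table and column names (dbo.Destination, RowSeq, Col1, Col2) are made up for illustration:

    -- The IDENTITY column increments as each row is inserted, so it records arrival order.
    CREATE TABLE dbo.Destination
    (
        RowSeq INT IDENTITY(1,1) NOT NULL,
        Col1   VARCHAR(100)      NULL,
        Col2   VARCHAR(100)      NULL
    );

    -- Order is only guaranteed on the way out when you ask for it explicitly.
    SELECT Col1, Col2
    FROM dbo.Destination
    ORDER BY RowSeq;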

  • Thanks for the suggestion. I tried adding an identity column to the destination, but it is not working. It shuffles the rows and then adds the identity column, so the data ends up sorted in a different order from the flat file.

    If I add data to an empty table, it works fine. But when I add data to a table that already has data in it, the rows get shuffled and the data is inserted in a different order.

    Any suggestions, please?

  • Create the table and add the identity column before inserting any records. If you have to, create a linked server and run an insert into your new table from that, or use SSIS... or...

  • Try adding the line number in the flat file itself, let us say at the beginning of each line.
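
    If the file does carry that line number as its first field, the load might look something like this sketch - the path, table, and column names are placeholders:

    -- Staging table whose first column matches the line number written into the file.
    CREATE TABLE dbo.FileLoad
    (
        LineNumber INT          NOT NULL,
        Col1       VARCHAR(100) NULL,
        Col2       VARCHAR(100) NULL
    );

    -- Hypothetical path and delimiters; adjust to the real file layout.
    -- Once loaded, ORDER BY LineNumber reproduces the file order.
    BULK INSERT dbo.FileLoad
    FROM 'C:\Import\datafile.txt'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');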

    -----------------------------------------------------------
    Time Is Money
    Calculating the Number of Business Hours Passed since a Point of Time
    Calculating the Number of Business Hours Passed Between Two Points of Time

  • Well, can't really help you then. A table does not ensure order of the data. The only way to ensure order when selecting data from a table is to use an ORDER BY clause with the column or columns by which the data should be ordered.

  • Thanks for the suggestions. The table is already created and there is already data in it. But whenever I insert data from now on, I need to see the data in the same order as in the flat file. I am using SSIS to insert the data.

    I cannot add the identity column to the flat file, as the data in the file is huge.

  • Hi Lynn Pettis,

    I guess the data is being inserted in a different order due to page splits. Do we have any options there? I don't know much about page splits.

    Regards,

    Suhan

  • If you cannot add an IDENTITY (increment) to the destination table, then I do not think it can be done. Data in a table is inherently UNordered, unless you add an identity or maybe a timestamp column to force the order. That's hands down the easiest way to do it. If that does not work for you, I can't help - I'm out of ideas.

  • mohammed_suhan (3/29/2012)


    Thanks for the suggestions. The table is already created and there is already data in it. But whenever I insert data from now on, I need to see the data in the same order as in the flat file. I am using SSIS to insert the data.

    I cannot add the identity column to the flat file, as the data in the file is huge.

    I never add data from a flat file directly to a table which already contains data. I always go through a staging table for many reasons. Data validation is one of the reasons and, sometimes, maintaining the same order as the insert is another but only with an IDENTITY column. You might also have to prevent parallelism.

    In any case, without an IDENTITY column, you can't guarantee "order" in a table. Even with an IDENTITY column, you can't guarantee the order of the output without an ORDER BY. Why? Way too big a subject to talk about here but it's a fact.
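
    A rough sketch of the staging-table leg of that approach, with made-up table and column names. It assumes the staging table's IDENTITY column was populated in file order during the load; the ORDER BY on the INSERT...SELECT then determines the order in which the permanent table's own IDENTITY values are assigned (though not the physical row order), and MAXDOP 1 keeps a parallel plan out of the picture:

    -- Hypothetical staging table: FileSeq is assigned as rows arrive from the flat file.
    CREATE TABLE dbo.Staging
    (
        FileSeq INT IDENTITY(1,1) NOT NULL,
        Col1    VARCHAR(100)      NULL,
        Col2    VARCHAR(100)      NULL
    );

    -- Move validated rows to the permanent table (assumed to have its own IDENTITY column).
    INSERT INTO dbo.Permanent (Col1, Col2)
    SELECT Col1, Col2
    FROM dbo.Staging
    ORDER BY FileSeq
    OPTION (MAXDOP 1);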

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The only way to ensure the data is not scrambled when inserting into a table is to reduce the "batch commit size" to 1. Whether bcp, SSIS or BULK INSERT, there is an option to reduce the batch size to 1. Now, what does that do to performance? It kills it! It's analogous to calling an insert proc n times in a row...but it will avoid the overhead of adding a line number to the file to correlate your data file line number to the database row.
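
    For reference, the BULK INSERT flavour of that option looks something like this (the path and table are placeholders, and the table's columns are assumed to line up with the file's fields); bcp exposes the same knob as its -b switch:

    -- Each batch is committed as its own transaction; BATCHSIZE = 1 commits row by row,
    -- which is exactly what kills performance on a large file.
    BULK INSERT dbo.Destination
    FROM 'C:\Import\datafile.txt'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', BATCHSIZE = 1);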

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Lynn Pettis (3/29/2012)


    pietlinden (3/29/2012)


    add an incrementing identity column to the destination table before running inserts?

    And then when you select from the table be sure to use an ORDER BY clause specifying that identity column.

    Since we're coming in from an external flat file - one way to do this without the ORDER BY is...to use SSIS. And no - that doesn't mean incorporating your existing SQL script in SSIS; you need to rebuild the data file import process using the SSIS components.

    Using a script component, you can implement a script-based row number which will in essence create the sequence number in the order of the rows within the file. You don't have to drop your batch commit size to 1 in this scenario.

    You'd end up with the data ordered, which you then use in some way to bring it into the permanent table.

    Now this will bring in an entirely different can of worms, so be sure the effort is worth it. That said - it does the job.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • opc.three (3/30/2012)


    The only way to ensure the data is not scrambled when inserting into a table is to reduce the "batch commit size" to 1. Whether bcp, SSIS or BULK INSERT, there is an option to reduce the batch size to 1. Now, what does that do to performance? It kills it! It's analogous to calling an insert proc n times in a row...but it will avoid the overhead of adding a line number to the file to correlate your data file line number to the database row.

    Please correct me if I'm wrong, but I do not think the RDBMS engine will gently insert rows in order just because batch_commit_size is set to 1 - rows will be inserted into the proper table and that's all we can be sure will happen. Physical order can only be preserved by having a unique clustered index on a column that defines the order in which you want the rows physically stored.
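
    For illustration, a unique clustered index of the kind described might look like this (the table and column names are hypothetical):

    -- Makes the sequence column the clustered key, so rows are physically stored in that order.
    -- An ORDER BY is still required to guarantee the order of any result set.
    CREATE UNIQUE CLUSTERED INDEX IX_Destination_RowSeq
        ON dbo.Destination (RowSeq);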

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions in a non-production test environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
  • PaulB-TheOneAndOnly (3/30/2012)


    opc.three (3/30/2012)


    The only way to ensure the data is not scrambled when inserting into a table is to reduce the "batch commit size" to 1. Whether bcp, SSIS or BULK INSERT, there is an option to reduce the batch size to 1. Now, what does that do to performance? It kills it! It's analogous to calling an insert proc n times in a row...but it will avoid the overhead of adding a line number to the file to correlate your data file line number to the database row.

    Please correct me if I'm wrong, but I do not think the RDBMS engine will gently insert rows in order just because batch_commit_size is set to 1 - rows will be inserted into the proper table and that's all we can be sure will happen. Physical order can only be preserved by having a unique clustered index on a column that defines the order in which you want the rows physically stored.

    And even that is not bulletproof. The best way to accomplish this is somehow adding an identity column to your tables. With SSIS you can do some transformation of the data (add the identity column) and base that off of data already in the table (read the last identity number and start there +1), or alter the table to add an identity column and hope that setting MAXDOP 1 keeps everything in order. I know which one I would go with if I had to be sure everything stayed in the same order.

    Of course, every query on the table will have to have an ORDER BY (ColumnToSortBy) to ensure the order coming out of the table.
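
    As a rough illustration of the "alter the table" option (names are made up), keep in mind that when an identity column is added to a table that already holds rows, the existing rows are numbered in no guaranteed order - which matches what the original poster saw:

    -- Existing rows get arbitrary values; rows inserted from now on receive the next value as they arrive.
    ALTER TABLE dbo.Destination
        ADD RowSeq INT IDENTITY(1,1) NOT NULL;

    -- Order must still be requested explicitly when reading.
    SELECT *
    FROM dbo.Destination
    ORDER BY RowSeq;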


    For faster help in answering any problems, please read How to post data/code on a forum to get the best help - Jeff Moden for the best way to ask your question.

    For performance issues see how we like them posted here: How to Post Performance Problems - Gail Shaw

    Need to split some strings? Jeff Moden's DelimitedSplit8K
    Jeff Moden's Cross tab and Pivots Part 1
    Jeff Moden's Cross tab and Pivots Part 2
