order of rows during bulk insert

  • Hi,

    When a bulk insert is done from a flat file or a xls file, how do u ensure the rows are added in the same order as they are in the xls file?

    Basically, I want to load the data into a table which has an identity column which would need to store the rownumber of each record in the original flat file or xls file.

    Thanks.

  • Is the data in the spreadsheet ordered according to any particular column(s)? If so, import the data and create a clustered index on that/those column(s). If you insist on having an identity column as well, then import into a staging table and then move into a table with an identity column.

    John

  • We need to migrate data from a ms access db to sql server. As ms access preserves the records in the order in which they were actually inserted, and the application logic depends on it (the last() function in ms access) , we need to maintain the same order in the sql server db.

    So, we thought of having an identity col which can be used for replicating the last() functionality in ms access.

    The problem is, while migrating the existing data, the id column should be the actual position of that record in the original ms access db.

  • So is there no natural key in your Access table, something like create_date or order_number? If not, how about adding an extra column and populating that with an ascending number sequence? Then you can import to a staging table on the SQL Server and insert to your main table from there.

    Better still, take this opportunity to start using the relational database as it was intended, and create a "proper" key rather than the somewhat arbitrary order of insertion or physical storage.

    John

  • Hi,

    When a bulk insert is done from a flat file or a xls file, how do u ensure the rows are added in the same order as they are in the xls file?

    Basically, I want to load the data into a table which has an identity column which would need to store the rownumber of each record in the original flat file or xls file.

    Thanks.

    Create a sequential key in the spreadsheet and bulk insert with identity insert on.

    We need to migrate data from a ms access db to sql server. As ms access preserves the records in the order in which they were actually inserted, and the application logic depends on it (the last() function in ms access) , we need to maintain the same order in the sql server db.

    So, we thought of having an identity col which can be used for replicating the last() functionality in ms access.

    The problem is, while migrating the existing data, the id column should be the actual position of that record in the original ms access db.

    Why not use the database upgrade wizard to put it in SQL for you?

  • I find the BCP utility with the the -E parameter to be the best tool when dealing with identity values within a data row. BCP is the "Rodney Dangerfield" of the data movement methods. I no longer use the IMPORT/EXPORT Wizzard.

  • Is there a way to ensure that a flat file is loaded into sql in the exact sequence of the source file?

    I am having a similar concern with a flat file. I need to ensure that the data loaded is loaded in the same sequence found in the flat file source. I have set up the target table with an identity column which has sequenced the rows 1,2,3,..etc. It appears that the first few rows match the source, but I need to confirm the file was loaded from top to bottom. Any help would be greatly appreciated. Thanks.



    Kind Regards,
    Tim
    :cool:[font="Courier New"]The stone age did not end due to a lack of stones, nor will the information age end due to a lack of data.[/font]:cool:

  • TIM.OGRADY (6/19/2010)


    Is there a way to ensure that a flat file is loaded into sql in the exact sequence of the source file?

    I am having a similar concern with a flat file. I need to ensure that the data loaded is loaded in the same sequence found in the flat file source. I have set up the target table with an identity column which has sequenced the rows 1,2,3,..etc. It appears that the first few rows match the source, but I need to confirm the file was loaded from top to bottom. Any help would be greatly appreciated. Thanks.

    Tim,

    it's usually better to start a new thread for a new issue. If you found a thread that's close to the problem you have, you could put a link into your thread. But that's just a side note - no need to start a new thread with this issue...

    The answer follows the pattern of this thread: do you have any kind of natural key that can be used to identify the row sequence of your source file (e.g. names, a date, a quantity or something else in ascending/descending order)?

    If so, you could use a self-join on your target table together with ROW_NUMBER() to ensure an ascending sequence.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Thanks Lutz,

    Unfortunately there is no true key to the data, and there are millions of rows (Manually adding them isn't going to work). The rows are ordered in a very specific relationship for example the data might look like this:

    1xxxxx

    2xxxxxxx

    3xxxxxxx

    3xxxxxxx

    3xxxxxxx

    1xxxx

    2xxxxxx

    2xxxxxx

    3xxx

    So 1 would be the header, 2 would be additional info, an 3 would be the details. however, the 3 are only related to the twos that are above it, and 2's are related only to the 1's above. I guess I am hoping that SSIS in sql2005 would always process and load the file from top to bottom. If this isn't the case, I'm hoping for a method to force it to do so.

    Again, Thanks.

    Tim



    Kind Regards,
    Tim
    :cool:[font="Courier New"]The stone age did not end due to a lack of stones, nor will the information age end due to a lack of data.[/font]:cool:

  • I strongly recommend you number the rows prior to import.

    I'm sure there are tools available on the web that will add line numbers. I, personally, use a DOS batch file I found on a German web site (comments and description ar all in German, so I don't think it would help that much...)



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Thank you Lutz



    Kind Regards,
    Tim
    :cool:[font="Courier New"]The stone age did not end due to a lack of stones, nor will the information age end due to a lack of data.[/font]:cool:

  • Yes SSIS will process it in the order that it finds it (or in the order that the source driver returns it). But you need to remember that there is no such thing as 'order' in SQL Server. The only way to gurantee the order that records are returned in is to use an order by query.

    Following on from this you must have a seperate field with the line number in order to guarantee that data will be returned in a certain order.

    I'm not familiar with the 'last' function in MS Access, but all equivalent functions in SQL Server require some sort of key to order by.

  • Thanks Nick,

    So would importing using SSIS into a table with an identity column work to ensure the original sequence? Thus a query like the following would match the source:

    Select Id_Column, *

    From <table>

    Order by Id_Column Asc



    Kind Regards,
    Tim
    :cool:[font="Courier New"]The stone age did not end due to a lack of stones, nor will the information age end due to a lack of data.[/font]:cool:

  • If you bulk insert the file into a table without a primary key on it, the records will be inserted in the order that they are in the file. After they are loaded, modify the table to add an identity field and set that as your primary key. This will ensure that the records are maintained in the order they were loaded.

    Steve Jimmo
    Sr DBA
    “If we ever forget that we are One Nation Under God, then we will be a Nation gone under." - Ronald Reagan

  • That would ensure that data flows through SSIS in the order that you have specified, yes.

    But say it loads it into a target table called MyTable

    If you then select it out of the database like this:

    SELECT * FROM MyTable

    Then there is NO guarantee whatsoever of the order that it will be returned in.

    If you then select it out of the database like this:

    SELECT * FROM MyTable

    ORDER BY ID_Column

    Then it will come out in the order that you require.

    Even if you create a clustered index on ID_Column in your table there is still no guarantee that data will come out in a specific order if you do not specify an order by.

    I understand that your data has header, detail, footer rows so the 'order' is important. Are you relying on this order to perform processing in SSIS or are you relying on this to perform processing in the database?

    Basically you must save some kind of 'order' key along with the data in your table. Thats the only way to guarantee that you can return it in a certain order. The order in which the data is loaded has nothing to do with it.

Viewing 15 posts - 1 through 15 (of 42 total)

You must be logged in to reply to this topic. Login to reply