SSIS - OleDB destination inserting to SQL Server in incorrect order

  • I have a data flow component that includes the following items -

    OleDB Source (Oracle)

    Script Component (Tranformation mode)

    OleDB Destination (SQL Server)

    There's quite a bit of processing done in the script component. I have primarily overridden Input0_ProcessInputRow, which based on many conditions called AddRow to the output buffer. This script component is finally outputting to OleDB destination.

    Now the problem I am facing is while there is an order in which rows are added to the output buffer it seems the SQL Server inserts aren't happening in the same order. My dataset is not that large (<1000 rows). The only thing is there is quite a bit of processing done in script component. If I spit the output to a flatfile, the order of rows matches exactly what's in the output buffer. Somewhere while records are getting actually inserted, the order is getting messed up.

    With simple pass through (minimal processing), the behavior is desired. I.e. rows are inserted in correct order. For our purpose the records need to be inserted in certain order.

    I suspect multithreaded engine (that data flow component uses) to be responsible but not sure how. Is there any settings that should be made?

    Any help would be great!

    Archna Monga

  • How do you know the insertion order is messed up in the data flow? How do you detect this? Do you simply do a SELECT against the result table?

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • Yes, I am checking with Select statement to the table.

    Let me also provide a bit of more information. In Input0_ProcessInputRow method, I have the following logic(just putting a pseudocode):

    if initialDataNotWritten Then

    In a for loop add rows to Output Buffer (it adds around 10-15 rows)

    End If

    Do processing of input row, add a row to Output Buffer

    if input buffer has ended

    In a for loop add rows to Output Buffer (it adds some of the left over rows that weren't processed)

    End if

    ---------------

    So, there will be a bunch of rows added to Output Buffer one time (first input buffer row read) including the first input buffer row. Then once all of the rows in input buffer are exhausted, another set of rows added to Output Buffer (last input buffer read).

    If I connect the same script component to a flat file destination the data is absolutely correct. But with OleDB destination, order is messed up. It gives an idea that Output Buffer is correct. My understanding is OleDb destination performs a bulk insert (with fastload) so whatever is in output buffer should be inserted into table as-is. At what point order is messing is puzzling to me.

  • Your test is not conclusive and it doesn't proof the insertion order is incorrect. There are no guarantees in what order a relation database table will return the rows, when you don't explicitly define ORDER BY clause. This is relational technology 101.

    Now if you want to have the insertion order you have to create an additional column in your relational table and then during the loading populate this column with incremental value. Then when you do a SELECT statement you just have to include ORDER BY this column and you will have your correct insertion order.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • The select (same one) statement without ORDER BY should give results in order that records were physically added since there is no primary key and index set on the table.

    As a test I tried spitting results from script component to a flat file and then in a separate data flow, used flat file source to write to oledb destination (without any row-by-row processing or anything else in between). The results are as desired.

  • Btw, I should mention that we are not supposed to change table structure as it is created by an application. And once the records have been added, the table is consumed by Oracle EPM, which expects records to be physically in certain order (we do not have any control on the select statement implementation internal to that application).

  • archiemonga (7/12/2009)


    The select (same one) statement without ORDER BY should give results in order that records were physically added since there is no primary key and index set on the table.

    As a test I tried spitting results from script component to a flat file and then in a separate data flow, used flat file source to write to oledb destination (without any row-by-row processing or anything else in between). The results are as desired.

    There is no such requirement to return records in the order they were added, even if you don't have primary key and index. The test you have made doesn't proof this will always work. It might work only for a short number of records and then this is database specific and setup specific. So you should not depend on this.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • archiemonga (7/12/2009)


    Btw, I should mention that we are not supposed to change table structure as it is created by an application. And once the records have been added, the table is consumed by Oracle EPM, which expects records to be physically in certain order (we do not have any control on the select statement implementation internal to that application).

    You can modify the physical table to contain the additional column, but make the Oracle pull the data from a view instead of the physical table.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply