RE: What happens if new records are inserted in a source table during a package execution?

  • wolfkillj (5/13/2013)


    Jeff Moden (5/10/2013)


    wolfkillj (5/7/2013)


    Phil Parkin (5/7/2013)


    This is probably what you meant by "I can program the loading process in another way, I know." but to remove most of the doubt I like to code extracts like this off of an ever-increasing key, i.e. an IDENTITY column setup as a unique index or primary key. This way I will select the maximum ID into a variable, then select all rows that are less than or equal to that ID.

    +1 to this, me too.

    Ditto on the +1. If you want to do only an incremental extraction (instead of extracting ALL rows WHERE ID <= MAX(ID)), create a log table with a column called extractedThroughID or something similar. When you begin the extraction, read the most recent extractedThroughID from the log table into a variable (lastExtractEndID, for example) and the MAX(ID) into another variable (e.g., currentExtractEndID), then extract only rows WHERE ID > lastExtractEndID AND ID <= currentExtractEndID. When the extraction process completes successfully, insert the value of currentExtractEndID into the log table as extractedThroughID so the extraction will start from there next time.
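    The watermark pattern described above might be sketched in T-SQL roughly as follows (a sketch only: the log table is named extractedThroughID per the post, but ExtractLog, dbo.SourceTable, and the column list are illustrative assumptions):

    ```sql
    -- Read the last watermark and the current high-water mark.
    DECLARE @lastExtractEndID int, @currentExtractEndID int;

    SELECT @lastExtractEndID = MAX(extractedThroughID) FROM dbo.ExtractLog;
    SELECT @currentExtractEndID = MAX(ID) FROM dbo.SourceTable;

    -- Extract only the rows that arrived since the previous run.
    SELECT ID, SomeColumn          -- other columns as needed
    INTO #Extract
    FROM dbo.SourceTable
    WHERE ID > @lastExtractEndID
      AND ID <= @currentExtractEndID;

    -- On success, record the new watermark for the next run.
    INSERT INTO dbo.ExtractLog (extractedThroughID, extractedAt)
    VALUES (@currentExtractEndID, SYSDATETIME());
    ```

    Recording the watermark only after a successful load means a failed run simply re-extracts the same range next time.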

    I guess I'm the odd ball. I've had some situations (high-speed staging tables fed by triggers on multiple other tables) where rows with IDs lower than the MAX(ID) were still in the process of being committed, and it caused some real headaches. I don't rely on MAX(ID). Instead, I read all of the rows that have no locks on them by using the READPAST hint, and I store the IDs in a temp table to control what I work on and delete, etc.

    Jeff -

    Thanks for mentioning this. It prompted me to think about how INSERT/UPDATE transactions that have been initiated but not yet committed could cause some rows to fall through the cracks of ETL processes that select rows for extraction by comparing an ever-increasing column against a "cutoff value". Thanks for suggesting your method - I'll be looking into it.

    Jason

    UPDATEs could be missed but that's an entirely different ballgame and would require a different approach than "less than or equal to IDENTITY". For INSERTs, I do not think they will fall through the cracks due to ACID properties.

    I like Jeff's technique for the scenario he mentioned, high speed staging tables fed by triggers on multiple other tables, but it could be deemed over-engineered in some environments. I have added it to my toolkit though, thanks Jeff!
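    For reference, Jeff's READPAST approach might look something like this (a sketch under assumptions: dbo.StagingTable and its columns are hypothetical names, not from the original posts):

    ```sql
    -- READPAST skips rows that are still locked by in-flight transactions
    -- instead of blocking on them, so only committed, unlocked rows are read.
    SELECT ID
    INTO #WorkList
    FROM dbo.StagingTable WITH (READPAST);

    -- Process the captured rows here, then remove exactly those rows
    -- from the staging table; rows skipped above remain for the next run.
    DELETE s
    FROM dbo.StagingTable AS s
    JOIN #WorkList AS w
      ON w.ID = s.ID;
    ```

    Because the temp table pins down exactly which IDs were read, the later DELETE cannot remove rows that were committed after the read, which is the point of the technique.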

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato