How to handle the a delete scenario in the following case?

  • Hi All,

    Here is an interesting and a challenging scenario that really gave us one hell of time during development.

    The scenario:

    An SSIS package will load data from a csv file initially, this is going to be a full load. Further, there will be only incremental load whenever the same file arrives, the same table needs to be populated.

    After the full load, data has to be inserted, updated or deleted. The file will therefore be provided with say 'n' number of records during the second time load and that will only be a new row or an update on the existing row. The rows deleted will not be coming during the second load(as they are deleted and we wouldnt know which one).

    How to handle such deleted records in the incremental load without altering the structure of the table/file?

    I do not wanna include a flag to notify if the row was deleted or not, is there any other option than using a flag?

  • Use a merge command, load the incremental data into a staging table the use merge to insert update or delete based on the unique column(s) of the row.

  • Personally I would have a preload Table that gets truncated at the start of every load, then once the file is loaded into this table its is merged into the main table.

    That way you can handle the Insert, Update and Delete in one single statement.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • I do agree, but the problem is with handling the delete scenario. The second load will only give me new/updated records on the full load happened initially. So deleted records wont come in my file at all. That doesnt mean that except new/updated records all were deleted.

  • If your unsure which ones where deleted, then you cant delete them, ask for a full export every time, not just an incremental update, then you can just upload the new or changed data, and you can see which ones are no longer in the export so you can delete them.

  • How about after initial configuration of your excel with all data, from next time onwards first transfer the required data to temp table and then transfer from this temp to excel.

    ----------
    Ashish

  • Thats a tricky one, as anthony says youll probably need a full extract to do the deletes.

    Out of curiosity how many rows of data are we considering here in the Full extract vs the partial file.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • Initially there will be data more than 2Lakhs, out of which the changes might be on few thousands during the next load.

  • ambarishan.purushothaman (12/17/2012)


    Initially there will be data more than 2Lakhs, out of which the changes might be on few thousands during the next load.

    Untill you do not get information about the deleted records you cant delete it.

    You have to ask the team providing you load excel to provide all records with a column which indicates the deletedew\existing record or something like this to identify.

  • Thank you all guys... That was indeed quick and useful...

  • ambarishan.purushothaman (12/17/2012)


    Initially there will be data more than 2Lakhs, out of which the changes might be on few thousands during the next load.

    So we're talking about 200,000 rows of data, which isnt a lot of data (unless its very wide), and SSIS should be able to cope with that in a very small window.

    What are you NFR's for the file load and how frequently does it occur?

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply