load five sources into five destinations

  • Hi SSIS Gurus

    I have 5 flat file Sources in a folder and i have to load this sources into five destinations.but condition is when the source data is updated automatically destination data will be update Previous data would not be added how we can do.

    Thanks

    Venki

  • do u mean u have five flat files corresponding to five tables into which u want to load the data?

    If yes, do u also mean that u're constantly adding records to the flat files and you want incremental data to be uploaded?



    Pradeep Singh

  • So the sources are being updated continuously? Or just from time to time? You may encounter some locking problems.

    One approach could be to move the source file to another folder, import it and then archive it - at least then you know that nothing will write to it while it is being processed, and that you will never process the same thing more than once. Hopefully, whatever is creating the source data will just recreate the file the next time it needs to write data.

    You then need a way of monitoring the source folder for new data - in the example above, that just means new files. You could either just schedule the SSIS package to run frequently (exiting gracefully if there are no files to process) or write a Windows service to monitor the folder and run the package in (almost) real time when new files are created.

    Phil

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Phil Parkin (12/22/2008)


    So the sources are being updated continuously? Or just from time to time? You may encounter some locking problems.

    One approach could be to move the source file to another folder, import it and then archive it - at least then you know that nothing will write to it while it is being processed, and that you will never process the same thing more than once. Hopefully, whatever is creating the source data will just recreate the file the next time it needs to write data.

    You then need a way of monitoring the source folder for new data - in the example above, that just means new files. You could either just schedule the SSIS package to run frequently (exiting gracefully if there are no files to process) or write a Windows service to monitor the folder and run the package in (almost) real time when new files are created.

    Phil

    You can monitor folder for changes using the File Watcher Task.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • Great tip - I did not know about that ... looks perfect for this.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • I like the idea of moving the files and we currently use the file watcher in one of our ETL's, works great. If you can't move the files, here's a couple options, both require either a timestamp or a unique key.

    If the key is available you can re-read the file each time and only add records where the key does not exist in the destination. Do this by staging the flat file to a temp table then use a merge join set up as a left join.

    Join the new data(left side of join) to existing data (right side of join) just include 1 piece of key data from the right side to check for null. Use a conditional split and only output data where the piece of right key data is null.

    If the data is timestamped then do as above and stage the entire flat file. Check the destination data for the greatest time stamp value and select all records that have a timestamp > than the max of the destination. This may be prone to errors depending on the accuracy of the timestamp and frequency which records are added to the flat file. A way around this would be to select records from the staged data using a range, the low end would be > max of the destination and the high would be the max of the staged less 1 of the smallest time unit. So if are dealing with seconds take the max stage less 1 second, that way you know all entries with that time value are in the stage table. hope that made sense.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply