Using the Konesans File Watcher Task in SSIS to Process Data Files

  • thanks for the explanation.

    I've tried setting up my project like your sample pic, but I only looped for 5 times. However I get an error when I put two or more files into the Drop path. So if I put file1 and file2 into the folder after trying to execute the file2 it will throw the error message at "Move Excel to Processing Folder" that file1 couldn't be found (which has already been moved to the Archive-folder) and nothing is written to the DB.

    In the for-each loop I put two Expressions: one for Directory = @[User::DropPath] and Name = @[User::FileName]. Without this, the for-each loop wouldn't trigger the inner components.

    So for example if I set up an infinite loop and one error occurs for some file nothing is written to the DB? Not nice 😉

    One more thing:

    Can you give an example how to trigger the right package depending on a certain string value? How do I set this up, where does the expression go?

    Ideas how to resolve these issues are appreciated.

    thanks.

  • tonyclifton77 (8/21/2012)


    thanks for the explanation.

    I've tried setting up my project like your sample pic, but I only looped for 5 times. However I get an error when I put two or more files into the Drop path. So if I put file1 and file2 into the folder after trying to execute the file2 it will throw the error message at "Move Excel to Processing Folder" that file1 couldn't be found (which has already been moved to the Archive-folder) and nothing is written to the DB.

    In the for-each loop I put two Expressions: one for Directory = @[User::DropPath] and Name = @[User::FileName]. Without this, the for-each loop wouldn't trigger the inner components.

    Check out sqlshare.com. They have a great set of videos for learning SSIS. Here is a good one on the For Each Loop container that would be worth your time:

    Looping Through and Loading Files with SSIS by Brian Knight

    So for example if I set up an infinite loop and one error occurs for some file nothing is written to the DB? Not nice 😉

    If you want your package to continue after an error occurs in your For Each Loop Container you'll need to set up your package to handle the error gracefully and carry on. Lookup Event Handling in SSIS (good start) for more information. It is a big topic and spans more than just error handling. Event Handling is a topic a lot of people ignore in SSIS because it is not front-and-center in the GUI and is not needed to get started doing small quick-and-dirty tasks but before long it becomes critical and in my opinion should be learned early and used often. You'll be happy you did when you can quickly debug your packages if something goes wrong, especially when they're in production running unattended.

    Can you give an example how to trigger the right package depending on a certain string value? How do I set this up, where does the expression go?

    Not sure what you mean by this one. I thought you were picking up your package name from a control table? Precedence Constraints may be what you want. I linked to them in my earlier response to you. You can double-click any of the connecting green lines to open the constraint editor.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Hello again,

    I read some articles, watched videos but still there are two things I'm struggling with at the moment:

    a) like I explained earlier, handling two or more files coming in at once, throws the above mentioned error at the "Move Excel to Processing Dir". I tried different settings but couldn't resolve this issue.

    b) Handling the extensions: xls, xlsm, xlsx. If I set the variable FileExtension to "*" or "xls" nothing happens for the first and only xls files get processed with the second. This is at the EventWatcher level. And in the ForEach-Loop do I also need to assign this FileExtension variable somewhere? Since I only have put .xlsx in the extension field.

    not sure what you mean by this one. I thought you were picking up your package name from a control table? Precedence Constraints may be what you want. I linked to them in my earlier response to you. You can double-click any of the connecting green lines to open the constraint editor.

    Since I couldn't use a DB lookup to get to the control table, I had to use a webservice from which I received the the needed string values. So I got that working.

    I hope you can help me to get this working.

    Thanks in advance.

  • I cannot see what you see so cannot really comment on specific tweaks to make to your settings. I would say keep plugging away because it sounds like you're close. If you want to post your code as an attachment to this thread I'll be happy to have a look.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • I now understand why the error occurs. I don't have the for-each loop set up correctly to process multiple files at once. It won't move to the next file and wants to process the same file again and of course it is not there anymore. So one file at a time is fine.

    Here is my setup:

    In the tutorials there "always" is a dataflow task with the Excel source in it but here I don't have this.

    Do I need that or can this be configured in a different way, within the For-Each loop maybe?

    and would you be so kind and check that multiple file extensions work with your "normal" event watcher tutorial.

    thank you

  • tonyclifton77 (8/24/2012)


    I now understand why the error occurs. I don't have the for-each loop set up correctly to process multiple files at once. It won't move to the next file and wants to process the same file again and of course it is not there anymore. So one file at a time is fine.

    Here is my setup:

    In the tutorials there "always" is a dataflow task with the Excel source in it but here I don't have this.

    Do I need that or can this be configured in a different way, within the For-Each loop maybe?

    There is no requirement to add a Data Flow Task.

    In your For Each Loop settings you should have the container mapping the file names it finds into a variable and then using that variable for your other operations.

    and would you be so kind and check that multiple file extensions work with your "normal" event watcher tutorial.

    You can change the WQL from using = to using LIKE when comparing TargetInstance.Extension, e.g. TargetInstance.Extension LIKE 'xls%'. In your For Each Loop you'll use something like *.xls, *.xls* or *.*

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • HI I am using file watcher task to watch files in folder. I want to take file names which files are newly added to the folder. In FWT we have "outputvariablename" property is there but it's returning only one file name only, if u add number of files. I want all those file name. How can i get the file names. Please help me in this manner.

    Thanks In Advance

    Vijay

  • parvathaneni.vijay (1/7/2013)


    HI I am using file watcher task to watch files in folder. I want to take file names which files are newly added to the folder. In FWT we have "outputvariablename" property is there but it's returning only one file name only, if u add number of files. I want all those file name. How can i get the file names. Please help me in this manner.

    Thanks In Advance

    Vijay

    The File Watcher Task recognizes the first file that arrives and immediately fires the event that causes your processing logic to execute synchronously. At that point, the File Watcher Task is no longer watching the file system, i.e. there is no support for multiple files arriving.

    If you want to process multiple files one way to do that is to put the FWT into a For Loop Container where the criteria is 1=1, i.e. it loops indefinitely. On the FWT set the FindExistingFiles property to True. Consider what this will accomplish when multiple files arrive:

    1. Package starts, For Loop Container (FLC) starts, File Watcher Task (FWT) begins executing and waits for new files to arrive.

    2. Two files arrive in sequence. The FWT recognizes the first file only, and it cedes control to the next Task in the FLC allowing it to be processed by the rest of the Tasks in the FLC.

    3. All Tasks in the FLC complete successfully, i.e. the first file is completely processed. The FLC loops and we return the to FWT.

    4. The FWT is executing again and because it has its FindExistingFiles property set to True it finds the second file that arrived. As it did with the first file it cedes control to the next Task in the FLC allowing it to be processed by the rest of the Tasks in the FLC.

    5. This process will continue until all files that have arrived are processed.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Excuse my ignorance (I don't do masses with SSIS other than moving data overnight from one database to another), but what is wrong with this....

    1. start a package that checks for the existence of a file

    2. Use 'For Loop Container' (while 'file_exists = 0')

    2a. Use a script task to cause the process to wait for 10 seconds

    2b. if the file is there load it and set 'file_exists = 1'

    Presumably there is some way of checking that the file is completely downloaded before processing?

    Obviously put some error handling and maybe a time limit so that it exits after a certain time.

    Simple and no 3rd party tool required.

    Pretty sure something like that worked perfectly for me 12 years ago in a SQL Server 2000 dts package.

  • From my perspective the 3rd party tool simplifies the development. Instead of having to create the script task and loops for each file, you use the File Watcher task.

    I was using the Konesans File Watcher for over a year without a problem, then windows updates in March 2014 introduced an interesting bug. Occasionally the File Watcher would just terminate the SSIS package from running. No errors were ever raises, so you couldn't gracefully handle the error within SSIS. We were dependent on the SQL job agent to report the failure.

    We ended up pulling the file watcher out, and using code similar to what you listed.

  • @colin and @jeff,

    This was article 2 of 3 that I wrote on this topic. Article 3 offers a solution to the problem case using only a Script Task. At the end I compare all three solutions.

    Using the Script Task in SSIS to Process Data Files When They Arrive[/url]

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Thanks....

    At least I know I'm not going mad....:-)

  • Why not just write a script task to do the same thing? It should only be a few lines of code.

    Gerald Britton, Pluralsight courses

  • g.britton (10/12/2014)


    Why not just write a script task to do the same thing? It should only be a few lines of code.

    It's a bit more than a few when you start to consider waiting for exclusive access to the file, and things like adding a timeout so you're not waiting for the file indefinitely but yeah, a Script Task is an option. See my last post a few before this one.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Its a bit more than a few, but just a bit! Its actually a pity we can't use F# for scripting (yet!) It would be substantially less code.

    The other thing is that if your company has a problem with third-party components (mine has a lengthy (months!) acceptance process), roll-your-own is an attractive option

Viewing 15 posts - 16 through 29 (of 29 total)

You must be logged in to reply to this topic. Login to reply