Using the WMI Event Watcher Task in SSIS to Process Data Files

  • Thanks, yeah, I seem to be having trouble with the WITHIN statement, but it could be permissions on my end, though I have tried setting control of the namespace to Everyone and still no joy. I was going to use this as a proof of concept for a project I am currently working on, but it is difficult to justify if I can't get a simple demo working. 🙁

    I have finally got it working with hard-coded values, so I will work from there and see if I can get it working with the variables. It looks like it may well have been security; I will let you know. 🙂

    Thanks

  • WMI tends to be a bit trickier to get working than some other methods due to security and the syntax of the WQL.

    This method, which uses a .NET class, is a little easier to implement in my opinion:

    Using the Script Task in SSIS to Process Data Files When They Arrive

    See the end of the article where I contrast the three methods.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Thank you, I will have a read-up on this. It seems that every time I add a variable into the WQL it stops working, which is strange. I will read up on this tomorrow and see if it is a better way to do this. 🙂

  • Orlando, this guide is pretty amazing. It's exactly what I was looking for to handle random data dumps from another team that needed to be imported to a table for reporting purposes.

    I had the file working locally. The SSIS package is stored in source control (with a copy in a network location) so a SQL job can execute it once every 6 hours (with a 5-minute timeout on the WMI Event Watcher Task).

    When I run the job, which uses the admin user on the server where the copy of the SSIS package is located, the Move Data File To Processing Folder task fails. I'll simulate the process with a mock file using the same Name.YYYYMMDD.xlsx name format. But the event, for whatever reason, doesn't look for the wildcard, Name.*.xlsx, in the directory. It gets the directory right, but wants the actual dummy file Name.YYYYMMDD.xlsx. I even tested this theory and it works: the package runs perfectly if I put Name.YYYYMMDD.xlsx in the folder being watched.

    The error given is as follows:

    Move Data File To Processing Directory File System Task Description: An error occurred with the following error message: "Could not find file '\\[servername]\IMPORT\Copilot\Copilot Online Support Data\AssociateData.YYYYMMDD.xlsx'

    For whatever reason it wants that actual file name; it won't search for the wildcard. Oddly enough, the WMI Event Watcher Task works: when I put in any file matching the AssociateData.YYYYMMDD.xlsx format, the watcher event finishes and the package proceeds to the Move Data File To Processing Folder task, and then fails because it can't find AssociateData.YYYYMMDD.xlsx.

    Any help would be greatly appreciated. I've tried messing with the Move Data File task, trying to get it to use the FileMask or something instead, but that causes other issues.

    Edit: Instead of using User::FileName to create the DropFile variable, I tried User::FileMask. The Move Data File Task gives this error now:

    An error occurred with the following error message: "Illegal characters in path"

    I'm guessing it doesn't like the *

  • jeanna_mcmahon (8/8/2013)



    Thanks for the kind feedback Jeanna. Sorry for the delay. My wife and I welcomed our first child, a son, 10 weeks ago and it has been all I can do to keep up with work and family priorities.

    It sounds like you are having trouble getting the Package to recognize the actual file name after the WMI Event Watcher fires. This tells me that it's possible you are missing the Event Handler on the WMI Event Watcher that waits for exclusive access and does the file name resolution for you. Let me ask you, did you build the Package by hand while following along with the article or did you start from the version of the Package that is included as an attachment to the article?

    On the Control Flow, if you click on the WMI Event Watcher Task to give it focus and then click on the Event Handlers tab, hopefully VS will drop you into the WMIEventWatcherEventOccurred event, and you should have a Script Task there named "Check File Availability." This Script Task is the piece of code that will look in the drop directory and resolve the file name for the found file, as well as set the value of the variable User::FileName. Please let me know how you get along after checking those things.
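    For anyone following along, here is a minimal sketch of what a "Check File Availability" style Script Task generally does: resolve the mask to a real file and wait for exclusive access before handing the name back to the package. This is only an illustration of the idea, not the code attached to the article, and the variable names User::DropDirectory, User::FileMask and User::FileName are assumptions.

    // Minimal sketch of a "Check File Availability" style Script Task body (SSIS, C#).
    // Body of Main() inside the Script Task's generated ScriptMain class.
    // The variable names below are assumptions, not necessarily those in the article.
    using System;
    using System.IO;
    using System.Threading;

    public void Main()
    {
        string dropDir = Dts.Variables["User::DropDirectory"].Value.ToString();
        string mask = Dts.Variables["User::FileMask"].Value.ToString(); // e.g. "AssociateData.*.xlsx"

        // Resolve the wildcard mask to an actual file in the drop directory.
        string[] matches = Directory.GetFiles(dropDir, mask);
        if (matches.Length == 0)
        {
            Dts.TaskResult = (int)ScriptResults.Failure;
            return;
        }

        string foundFile = matches[0];

        // Wait until the file can be opened exclusively, i.e. the writer is finished.
        while (true)
        {
            try
            {
                using (FileStream fs = File.Open(foundFile, FileMode.Open,
                                                 FileAccess.ReadWrite, FileShare.None))
                {
                    break; // exclusive access attained
                }
            }
            catch (IOException)
            {
                Thread.Sleep(2000); // still locked by the writer; try again shortly
            }
        }

        // Hand the resolved file name back to the package for the downstream tasks.
        Dts.Variables["User::FileName"].Value = Path.GetFileName(foundFile);
        Dts.TaskResult = (int)ScriptResults.Success;
    }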

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Hi, I have some questions to which I have not found the answers in your post. Please advise.

    I have created the same package step by step, but for a text file instead of the xlsx file.

    But when I copy the text file to the ExcelDrop folder, sometimes the WMI event fires and sometimes it does not. What is the reason for that? Does WMI fire only when files are newly created in the folder? If we rename an old file, will it not fire for that?

    Also, how will the WMI event check for new files 24x7? When I execute the package, the WMI task shows yellow status. It watches for the first file created in the folder with the name BusinessData.YYYYMMDD.txt, executes all the steps, and goes to green status. But if I copy or create the same file BusinessData.YYYYMMDD.txt in the folder again, it does not detect the new file. Please suggest how to make it check for new files 24x7.

    One more thing I want to know about the file name: how will it identify the file if the name changes by date, like BusinessData.20131031.txt or BusinessData.20131030.txt, etc.? What do we need to do for that?

  • nitin.varshney (10/31/2013)



    Yes...I was also having the same situation....

  • The WMI Event Watcher can present us with some seemingly unpredictable behavior. It is true that renaming a file to satisfy the mask will not cause the event to fire. The event being watched for is "file created" not "file renamed", so keep in mind the exact nature of the event being watched for. That said, even adding new files to the directory, depending on how you do it, can sometimes yield no results. Be deliberate in your testing and make sure you're actually creating a new file in the directory.

    I also developed a method using the .NET FileSystemWatcher class whose behavior is much more predictable in my opinion. While the WMI Event Watcher method is correctly implemented, it is not as user friendly or easy to test.

    Using the Script Task in SSIS to Process Data Files When They Arrive
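    As a rough illustration of that approach only (not the code from the article above), a Script Task can block on FileSystemWatcher until a matching file is created; the directory, mask and timeout below are placeholder assumptions.

    // Minimal sketch of watching a drop directory with the .NET FileSystemWatcher (C#).
    // The directory, file mask and timeout are illustrative placeholders.
    using System.IO;

    string dropDir = @"C:\ExcelDrop";        // assumed drop directory
    string fileMask = "BusinessData.*.xlsx"; // assumed file mask

    using (FileSystemWatcher watcher = new FileSystemWatcher(dropDir, fileMask))
    {
        // Block for up to 5 minutes waiting for a matching file to be created.
        WaitForChangedResult result =
            watcher.WaitForChanged(WatcherChangeTypes.Created, 5 * 60 * 1000);

        if (result.TimedOut)
        {
            // No file arrived within the timeout; end the package gracefully.
        }
        else
        {
            string foundFile = Path.Combine(dropDir, result.Name); // resolved file name
        }
    }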

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • The WMI query uses a wildcard so it will find files named with a datestamp. See LIKE 'BusinessData%':

    SELECT *
    FROM __InstanceCreationEvent WITHIN 5
    WHERE TargetInstance ISA 'CIM_DataFile'
    AND TargetInstance.Drive = 'C:'
    AND TargetInstance.Path = '\\@\\ExcelDrop\\'
    AND TargetInstance.FileName LIKE 'BusinessData%'
    AND TargetInstance.Extension = 'xlsx'

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Good article.

    I do not think the WMI event handler is very reliable. I had it working for a remote UNC file share, but it often stopped listening for no apparent reason. There are suggestions that there are limits to the number of messages it can digest without a server reboot. Ultimately I replaced it in my package with a Script Task that uses System.IO and polls for file arrival on a timer.

  • bart.de.koning (9/1/2014)



    Thanks.

    I have heard reports of the WMI technique being a bit inconsistent at times. I think a lot of the overall performance level will depend on the environment, e.g. host OS, network quality, etc.

    I did three articles on this topic, this one leveraging WMI directly being the first, and each building on the last. Here is the third one. At the end of the article I contrast the three techniques:

    Using the Script Task in SSIS to Process Data Files When They Arrive

    I have heard reports of even the FileSystemWatcher (used in the article above) behaving inconsistently when it comes to watching an SMB-NAS directory for a file. I have a mind to write a fourth article that implements a Script Task containing a simple for-loop with a check for a file, a Thread.Sleep and some logic around the timeout settings, but I am not sure when, if ever, I will get around to writing it.
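    In that spirit, here is a minimal sketch of such a polling loop, assuming illustrative directory, mask, timeout and poll-interval values (none of these come from the articles):

    // Minimal sketch of a polling loop that waits for a file matching a mask (C#).
    // Directory, mask, timeout and poll interval are illustrative assumptions.
    using System;
    using System.IO;
    using System.Threading;

    public static class FilePoller
    {
        public static string WaitForFile(string dropDir, string fileMask, TimeSpan timeout)
        {
            DateTime deadline = DateTime.UtcNow + timeout;

            while (DateTime.UtcNow < deadline)
            {
                // Check for any file that satisfies the mask.
                string[] matches = Directory.GetFiles(dropDir, fileMask);
                if (matches.Length > 0)
                {
                    return matches[0]; // fully qualified path of the first match
                }

                Thread.Sleep(TimeSpan.FromSeconds(15)); // poll interval
            }

            return null; // timed out without seeing a matching file
        }
    }

    In a Script Task, the return value would be written to a package variable such as User::FileName (again, an assumed name), with a null result treated as a timeout.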

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • I know this was originally published a while back but I just read through it; it solves a problem I have been struggling with.

    Very well written. Clear and to the point. This is an excellent article, Orlando; 5 stars.

    "I can't stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Great article. Thanks! What if your "drop folder" is an (s)ftp site? Do you know if it is possible to test whether a file available on a (s)ftp site is done being written to?

  • Alan.B (10/3/2014)



    Thanks Alan! I really appreciate the feedback.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • cjmorelock (10/3/2014)



    Thanks for the feedback!

    I think you are asking about the case where your process is watching a local network or server folder, and that folder is exposed to accept file uploads via an SFTP site running in your environment. If that is correct, then it can present some challenges. If memory serves, the SFTP protocol operates in chunks of data, yet not all SFTP service implementations maintain an open file handle in between chunks being written to the file; i.e., the SFTP service repeatedly opens the target file for writing, appends a chunk, closes the file handle, accepts another chunk from the client session, and so on.

    In this case you would need to customize the code in the article to do an additional check after you initially attain exclusive access to the file: check the file size, wait maybe 15 seconds, then check the file size again to ensure the file is not still being written to from an SFTP client session, i.e. that it did not grow within those 15 seconds. If it did grow, loop again, checking for exclusive access and file growth until the size is stable.
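    Purely as an illustration of that file-growth check (not code from the article), the extra step could look something like this; the 15-second wait is the assumed interval from the paragraph above:

    // Minimal sketch of a "has the file stopped growing?" check (C#).
    // Call this after exclusive access is first attained; the interval is an assumption.
    using System.IO;
    using System.Threading;

    public static class FileGrowthCheck
    {
        public static void WaitUntilStable(string filePath)
        {
            long lastSize = -1;

            while (true)
            {
                long currentSize = new FileInfo(filePath).Length;

                // The file is considered stable only if the size did not change
                // between two samples taken 15 seconds apart.
                if (currentSize == lastSize)
                {
                    return;
                }

                lastSize = currentSize;
                Thread.Sleep(15 * 1000);
            }
        }
    }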

    The technique you mentioned can pose problems because in truth, you never really know when the data provider is done writing to the data file. Even if you wait a minute and the data file stopped growing, what if their upload process failed for some reason and only a partial file has been uploaded? You would not have any way to know, especially if the partial file unluckily happened to end up structurally valid according to a user or ETL process that consumes it.

    Many data consumers will negotiate a "contract" with their data providers where the provider will upload a small control file after they upload their data file. The control file acts as a signal to the data consumer that the data file is ready for download. The data exchange looks like this, in sequence:

    1. data provider uploads data file

    2. data provider uploads control file

    3. data consumer polls or watches for control file

    4. data consumer finds control file

    5. data consumer downloads data file

    The control file is usually empty, or only a few bytes with a simple message inside or some metadata about the data file. It exists to help the data consumer know when the data file is ready, and perhaps to share some details about the data file, for example byte count, number of lines, maybe a checksum, etc.
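    To make the consumer side of that contract concrete, here is a rough sketch; the file names, poll interval and downstream handling are all assumptions for illustration:

    // Minimal sketch of the consumer side of a control-file contract (C#).
    // File names and the poll interval are illustrative assumptions.
    using System.IO;
    using System.Threading;

    public static class ControlFileConsumer
    {
        public static void WaitAndConsume(string dropDir)
        {
            string controlFile = Path.Combine(dropDir, "Data.ctl"); // assumed control file name
            string dataFile = Path.Combine(dropDir, "Data.csv");    // assumed data file name

            // Steps 3 and 4: poll until the provider has uploaded the control file.
            while (!File.Exists(controlFile))
            {
                Thread.Sleep(30 * 1000);
            }

            // Step 5: the control file signals that the data file is complete; consume it.
            string[] lines = File.ReadAllLines(dataFile);
            // ...load the rows into the staging table, archive both files, etc.
        }
    }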

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato
