Using the Script Task in SSIS to Process Data Files When They Arrive

  • For anyone trying to convert this into SQL 2008 R2, it seems the SSISScriptTaskEntryPointAttribute does not exist in the ScriptTask namespace. I am trying to figure out at the moment how to get around this and will let you know what to use instead once I have (unless someone already knows and can tell me before I go mad). 🙂

    That and setting the Expression on the Excel connection seem to be the only sticking points at the moment.

  • Rick, Did you have any luck with the SSISScriptTaskEntryPointAttribute in SQL 2008 R2?

  • aka88 (11/16/2012)


    Rick, Did you have any luck with the SSISScriptTaskEntryPointAttribute in SQL 2008 R2?

    No, no luck with that. I have instead used a WMI event task that runs a job, this then runs a SSIS package with a foreach container. I am still having issues with the scripting task though as using the code from the original, it seems to ignore whether the file is locked or not.

    I am using the code from http://www.sqlservercentral.com/articles/Integration+Services+(SSIS)/90571/ and trying to work out exactly what is going wrong. I should have it finished by the end of today if I can work it out. If not, I may just set a few retry attempts on the job instead as I need to get this finished and I can't envision a file being locked for more than a few minutes max.

  • To Rick-153145 & aka88,

    Sorry for the delay guys. I would have posted this code sooner but my 2008 R2 environment needed to be rebuilt and I just got around to it.

    Here is Script Task code that works in SSIS 2008 R2 to accomplish the task specified in my article. The code itself is no different, only the layout of the class per some changes to the SSIS scripting engine between versions 2008 R2 and 2012.

    [font="Courier New"]

    #region Help:  Introduction to the script task

    /* The Script Task allows you to perform virtually any operation that can be accomplished in

    * a .Net application within the context of an Integration Services control flow.

    *

    * Expand the other regions which have "Help" prefixes for examples of specific ways to use

    * Integration Services features within this script task. */

    #endregion

    #region Namespaces

    using System;

    using System.Data;

    using Microsoft.SqlServer.Dts.Runtime;

    using System.Windows.Forms;

    using System.IO;

    using System.Threading;

    #endregion

    namespace SqlServerCentral

    {

        [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]

        public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

        {

            #region VSTA generated code

            enum ScriptResults

            {

                Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,

                Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure

            };

            #endregion

            #region Help:  Using Integration Services variables and parameters in a script

            /* To use a variable in this script, first ensure that the variable has been added to

             * either the list contained in the ReadOnlyVariables property or the list contained in

             * the ReadWriteVariables property of this script task, according to whether or not your

             * code needs to write to the variable.  To add the variable, save this script, close this instance of

             * Visual Studio, and update the ReadOnlyVariables and

             * ReadWriteVariables properties in the Script Transformation Editor window.

             * To use a parameter in this script, follow the same steps. Parameters are always read-only.

             *

             * Example of reading from a variable:

             *  DateTime startTime = (DateTime) Dts.Variables["System::StartTime"].Value;

             *

             * Example of writing to a variable:

             *  Dts.Variables["User::myStringVariable"].Value = "new value";

             *

             * Example of reading from a package parameter:

             *  int batchId = (int) Dts.Variables["$Package::batchId"].Value;

             *  

             * Example of reading from a project parameter:

             *  int batchId = (int) Dts.Variables["$Project::batchId"].Value;

             *

             * Example of reading from a sensitive project parameter:

             *  int batchId = (int) Dts.Variables["$Project::batchId"].GetSensitiveValue();

             * */

            #endregion

            #region Help:  Firing Integration Services events from a script

            /* This script task can fire events for logging purposes.

             *

             * Example of firing an error event:

             *  Dts.Events.FireError(18, "Process Values", "Bad value", "", 0);

             *

             * Example of firing an information event:

             *  Dts.Events.FireInformation(3, "Process Values", "Processing has started", "", 0, ref fireAgain)

             *

             * Example of firing a warning event:

             *  Dts.Events.FireWarning(14, "Process Values", "No values received for input", "", 0);

             * */

            #endregion

            #region Help:  Using Integration Services connection managers in a script

            /* Some types of connection managers can be used in this script task.  See the topic

             * "Working with Connection Managers Programatically" for details.

             *

             * Example of using an ADO.Net connection manager:

             *  object rawConnection = Dts.Connections["Sales DB"].AcquireConnection(Dts.Transaction);

             *  SqlConnection myADONETConnection = (SqlConnection)rawConnection;

             *  //Use the connection in some code here, then release the connection

             *  Dts.Connections["Sales DB"].ReleaseConnection(rawConnection);

             *

             * Example of using a File connection manager

             *   object rawConnection = Dts.Connections["Prices.zip"].AcquireConnection(Dts.Transaction);

             *  string filePath = (string)rawConnection;

             *  //Use the connection in some code here, then release the connection

             *  Dts.Connections["Prices.zip"].ReleaseConnection(rawConnection);

             * */

            #endregion

            #region Instance variables

            // we need access to the found file info from the FileSystemWatcher OnFileCreate event in our class

            // scope. an instance variable may look odd but will do for our purposes

            private FileInfo foundFile = null;

            #endregion

            #region Method: void Main()

            /// <summary>

            /// This method is called when this script task executes in the control flow.

            /// Before returning from this method, set the value of Dts.TaskResult to indicate success or failure.

            /// To open Help, press F1.

            /// </summary>

            public void Main()

            {

                try

                {

                    // initialize common variables from DTS variables collection

                    string dropDirectory = Dts.Variables["User::WatcherInputDropPath"].Value.ToString();

                    string fileMask = Dts.Variables["User::WatcherInputFileMask"].Value.ToString();

                    bool includeSubdirectories = Convert.ToBoolean(Dts.Variables["User::WatcherInputIncludeSubdirectories"].Value);

                    // look for existing files if configuration suggests we should

                    bool findExistingFiles = Convert.ToBoolean(Dts.Variables["User::WatcherInputFindExistingFiles"].Value);

                    if (findExistingFiles)

                    {

                        FindExistingFile(dropDirectory, fileMask, includeSubdirectories);

                    }

                    // do we (still) need to look for a file?

                    if (foundFile == null)

                    {

                        // if we made it here there were no existing files to process (or we didn't check for them per the

                        // configuration variables) so setup a FileSystemWatcher object per the configuration variables

                        bool timeoutAsWarning = Convert.ToBoolean(Dts.Variables["User::WatcherInputTimeoutAsWarning"].Value);

                        int timeoutSeconds = Convert.ToInt32(Dts.Variables["User::WatcherInputTimeoutSeconds"].Value);

                        int timeoutMilliseconds = (timeoutSeconds == 0 ? -1 : timeoutSeconds * 1000);

                        WatchForFileCreation(dropDirectory, fileMask, includeSubdirectories, timeoutAsWarning, timeoutMilliseconds);

                    }

                    Dts.TaskResult = (int)ScriptResults.Success;

                }

                catch (Exception e)

                {

                    Dts.Events.FireError(0, null, e.Message, string.Empty, 0);

                    Dts.TaskResult = (int)ScriptResults.Failure;

                }

            }

            #endregion

            #region Event: void OnFileCreate(object source, FileSystemEventArgs e)

            /// <summary>

            /// Event attached to FileSystemWatcher when a file is created.

            /// </summary>

            /// <param name="source">Event source.</param>

            /// <param name="e">Event arguments.</param>

            private void OnFileCreate(object source, FileSystemEventArgs e)

            {

                PreProcessFoundFile(new FileInfo(e.FullPath));

            }

            #endregion

            #region Method: WatchForFileCreation

            /// <summary>

            /// Sets up a FileSystemWatcher to watch for new files being created.

            /// </summary>

            /// <param name="dropDirectory">Directory to watch</param>

            /// <param name="fileMask">File pattern mask of files being watched for.</param>

            /// <param name="includeSubdirectories">If true all subdirectories are also watched.</param>

            /// <param name="timeoutAsWarning">If true then if watcher times out only a warning is raised, i.e. the Task succeeds.</param>

            /// <param name="timeoutMilliseconds">Number of milliseconds to wait for a file to be initially created. This timeout period

            /// does not apply to the tiem spent waiting for exclusive access to be gained to the file.</param>

            private void WatchForFileCreation(string dropDirectory,

                string fileMask,

                bool includeSubdirectories,

                bool timeoutAsWarning,

                int timeoutMilliseconds)

            {

                // create a new FileSystemWatcher

                FileSystemWatcher fileSystemWatcher = new FileSystemWatcher();

                // set the path to watch to our 'drop directory'

                fileSystemWatcher.Path = dropDirectory;

                // set the option to watch subdirectories

                fileSystemWatcher.IncludeSubdirectories = includeSubdirectories;

                // set the filter of files to watch for to our 'file mask'

                fileSystemWatcher.Filter = fileMask;

                // add event handler to execute when new files are created

                fileSystemWatcher.Created += new FileSystemEventHandler(OnFileCreate);

                // begin watching

                fileSystemWatcher.WaitForChanged(WatcherChangeTypes.Created, timeoutMilliseconds);

                if (foundFile == null)

                {

                    // the file watcher timed out waiting for a file  :-<

                    string message = "Timeout waiting for file {Path='" + dropDirectory + "'; Filter='" + fileMask +

                         "'; IncludeSubdirectories=" + includeSubdirectories.ToString() + "}.";

                    if (timeoutAsWarning)

                    {

                        // only raise a warning

                        Dts.Events.FireWarning(0, null, message, string.Empty, 0);

                    }

                    else

                    {

                        // raise an error

                        throw new TimeoutException(message);

                    }

                }

            }

            #endregion

            #region Method: void PreProcessFoundFile(FileInfo dataFile)

            /// <summary>

            /// Takes actions subsequent to locating a file that allow later processing of the file. This method

            /// reports information to the parent container by firing info events. This method also ensures exclusive

            /// access to the file can be achieved before returning control to the parent container.

            /// </summary>

            /// <param name="dataFile">File to preprocess.</param>

            private void PreProcessFoundFile(FileInfo dataFile)

            {

                // set the instance variable value to the found file

                this.foundFile = dataFile;

                // local variable to pass to events that require parameters be passed by ref

                bool fireAgain = true;

                // raise an information event saying we found a file (not necessarily that it can be used)

                Dts.Events.FireInformation(0, null, "File found: " + dataFile.FullName, string.Empty, 0, ref fireAgain);

                // We know there is a new file that can be processed because

                // the FileSystemWatcher fired an event, however we do not know if the user or process

                // supplying the file has completed uploading it. We will loop over drop directory

                // looking for files that meet our criteria and once we find one we will make sure

                // the supplier has completed their upload process by checking to see if we can gain

                // exclusive access to the file. Once we can gain exclusive access to the file we will know

                // the upload is complete and we can allow the rest of the SSIS package to continue.

                WaitForExclusiveAccess(dataFile);

                // store the full file name (includes path) in output variable

                Dts.Variables["User::WatcherOutputFileFullName"].Value = dataFile.FullName;

                // store the file name in output variable

                Dts.Variables["User::WatcherOutputFileName"].Value = dataFile.Name;

                // raise an information event saying we found a file -and- it can be used

                Dts.Events.FireInformation(0, null, "File ready for use: " + dataFile.FullName, string.Empty, 0, ref fireAgain);

            }

            #endregion

            #region Method: void WaitForExclusiveAccess(FileInfo dataFile)

            /// <summary>

            /// Waits until exclusive access to a file can be achieved.

            /// </summary>

            /// <param name="dataFile">File to access.</param>

            private void WaitForExclusiveAccess(FileInfo dataFile)

            {

                // local variable to say how many seconds to wait in between checking if we can gain

                // exclusive access to the found file

                int secondsToWaitBetweenAttempts = 5;

                // local variable to pass to events that require parameters be passed by ref

                bool fireAgain = true;

                // Loop indefinitely checking if we can access the data file.

                while (1 == 1)

                {

                    try

                    {

                        // Attempt to gain access to the file.

                        using (Stream stream = new FileStream(dataFile.FullName, FileMode.Open))

                        {

                            // If we made it here no exception was thrown meaning we

                            // could access the file. We will break out of the loop and allow

                            // the rest of the package to continue processing.

                            break;

                        }

                    }

                    catch (IOException)

                    {

                        // We are not interested in ending the program when an IOException

                        // occurs in this area. This type of exception means we could not

                        // gain access to the file.

                        // In general, programming algorithms that leverage exceptions for

                        // control flow are frowned upon. However in the case of file access

                        // it is an acceptable pattern.

                    }

                    // raise an information event saying we could not gain exclusive access to the found file and will wait

                    Dts.Events.FireInformation(0, null, "Could not gain exclusive access to file " + foundFile.FullName +

                                                               ". Waiting " + secondsToWaitBetweenAttempts.ToString() +

                                                               " seconds before trying again...", string.Empty, 0, ref fireAgain);

                    // wait some time before checking whether the file can be used

                    Thread.Sleep(secondsToWaitBetweenAttempts * 1000);

                }

            }

            #endregion

            #region Method: void FindExistingFile(string directoryName, string fileMask, bool includeSubdirectories)

            /// <summary>

            /// Check a directory for files that match a file mask.

            /// </summary>

            /// <param name="directoryName">Directory to look for files.</param>

            /// <param name="fileMask">File pattern mask matching files to look for.</param>

            /// <param name="includeSubdirectories">True if subdirectories should also be checked.</param>

            private void FindExistingFile(string directoryName, string fileMask, bool includeSubdirectories)

            {

                // local variable to pass to events that require parameters be passed by ref

                bool fireAgain = true;

                // get the list of files that qualify

                DirectoryInfo directoryInfo = new DirectoryInfo(directoryName);

                FileInfo[] fileInfos;

                if (includeSubdirectories)

                { fileInfos = directoryInfo.GetFiles(fileMask, SearchOption.AllDirectories); }

                else

                { fileInfos = directoryInfo.GetFiles(fileMask, SearchOption.TopDirectoryOnly); }

                // check to see if any files were found

                if (fileInfos.Length > 0)

                {

                    // found a file!

                    PreProcessFoundFile(fileInfos[0]);

                    // raise an info message

                    Dts.Events.FireInformation(0, null, "Existing files found: " + fileInfos.Length.ToString(), string.Empty, 0, ref fireAgain);

                }

                else

                {

                    // no files found, raise a warning

                    Dts.Events.FireWarning(0, null, "No existing files found.", string.Empty, 0);

                }

            }

            #endregion

        }

    }[/font]

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • mishaluba (10/27/2012)


    Thank you very much for the great article! This certainly goes into the "briefcase". I have one question, which you sort of answered in one of the other comments, but I would like to make sure. How can we make this SSIS package actually run continuously? You mentioned placing the entire code inside the loop WHILE 1=1. So guess we would create a job executing the package, start it and it would run forever. Are there other methods?

    Thank you for the kind feedback. Using a loop is the traditional method to have an SSIS package run indefinitely. In SSIS a WHILE loop that runs indefinitely can be implemented using the For Loop Container. You'll just want to be careful to ensure that inside the loop there is a task that waits for some event to occur that does not use a lot of CPU, e.g. a Script Task that uses the FileSystemWatcher class per the implementation in the article, so the SSIS package does not spike the CPU to 100%.

    edit: spelling

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Nice article.

  • Useful article!

  • Very helpfull article, I've used it to build an ETL import for multiple files, and I've encountered a strange behavior.

    I've set up an excel template (.xltx) in the "Drop" folder that the end user should use. When the template is opened, data should be added and then the file should be saved as a normal .xlsx file. What happens now is that I save the file directly in the drop folder, I see the file appearing inthe file system together with the temporary ~$filename.xlsx file. The package won't and shouln't load the file at this point. Now I close the file in Excel (I leave excel open though) and while the temporary file disappears properly, the package still won't process the file. Is this something that should be solved in the C# code "WaitForExclusiveAccess" or is this a filesystem problem?

    I hope I've explained my problem clearly enough.

    Regards, Martijn

  • msandbergen (1/4/2013)


    Very helpfull article, I've used it to build an ETL import for multiple files, and I've encountered a strange behavior.

    I've set up an excel template (.xltx) in the "Drop" folder that the end user should use. When the template is opened, data should be added and then the file should be saved as a normal .xlsx file. What happens now is that I save the file directly in the drop folder, I see the file appearing inthe file system together with the temporary ~$filename.xlsx file. The package won't and shouln't load the file at this point. Now I close the file in Excel (I leave excel open though) and while the temporary file disappears properly, the package still won't process the file. Is this something that should be solved in the C# code "WaitForExclusiveAccess" or is this a filesystem problem?

    I hope I've explained my problem clearly enough.

    Regards, Martijn

    Thank you for the kind feedback. Your solution is not one I had imagined (using an Excel-template in the same directory for users to open and save an xlsx) but it is very well conceived from a workflow standpoint. Thank you for sharing it.

    When the package runs, if the C# code recognizes a file but it cannot gain exclusive access to the file it will raise an informational event saying:

    Could not gain exclusive access to file fileName. Waiting secondsToWaitBetweenAttempts seconds before trying again...

    I assume you are seeing this output even after the file has been closed in Excel. My initial thought is that it is Excel that is not releasing the file handle, even though the file has been closed within the program, and nothing to do with the Windows file system. It sounds like you can easily recreate the scenario. Once the scenario is created, what happens when you completely close Excel?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Seems like my problem has solved itself after I restarted Excel (using 2010). Now the file is processed properly as soon as I close Excel and even when I only close the document and leave excel open .... I figure I shouldn't leave my programs open for weeks ...

    Still, what happens if I traverse through subdirectories and 1 file won't load, will it still parse other file or will it keep waiting forever?

    Thanks, Martijn

  • msandbergen (1/4/2013)


    Seems like my problem has solved itself after I restarted Excel (using 2010). Now the file is processed properly as soon as I close Excel and even when I only close the document and leave excel open .... I figure I shouldn't leave my programs open for weeks ...

    Still, what happens if I traverse through subdirectories and 1 file won't load, will it still parse other file or will it keep waiting forever?

    Thanks, Martijn

    I am assuming you have WatcherInputIncludeSubdirectories set to True. The FileSystemWatcher class will watch a directory and all its sub-directories. It fires once after detecting the initial file that was placed somewhere in the directory hierarchy it is watching and that is the file name that it stores for later use by the rest of the code. If that initial file is held open indefinitely then it will cause the Script Task to wait indefinitely despite more files possibly being added to the directory hierarchy.

    A nice addition to the code might be to time the process out inside the WaitForExclusiveAccess method per the WatcherInputTimeoutSeconds variable, possibly incorporating the total time spent waiting for the file to arrive plus the time waiting for exclusive access.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Thanks for the step-by-step... I've had a project like this on the horizon for a while (multiple ASII files). Your article helps coalesce a lot of thoughts into a simple, straightforward approach.

    Cheers from Denver (also).

    Mark
    Just a cog in the wheel.

  • This is a fantastic article that provides very clear instructions and a good comparison to see when it should be used and when other options would not be appropriate. Thank you for posting it.

    ---
    Timothy A Wiseman
    SQL Blog: http://timothyawiseman.wordpress.com/

  • The Script Task brings with it some flexibility and ease of use that offer us some advantages over the other two techniques, yet has some potential drawbacks of its own.

    This is from the last paragraph, but looking through the table there didn't seem to be an obvious drawback. Did you mean in general or just depending on the environment? Could you give a specific drawback?

    Thanks,

  • Thanks Mark and Timothy for the kind feedback. That's the kind of stuff that keeps me going!

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

Viewing 15 posts - 16 through 30 (of 62 total)

You must be logged in to reply to this topic. Login to reply