SSIS SFTP using WinSCP

  • Hi

    I am using WinSCP to create a SFTP task by following this guide:

    https://winscp.net/eng/docs/guide_ssis

    This works IF the file path to the remote directory is explicitly stated.  For example, the script contains below line

    cd /order_000054933/item_000084322

    The problem is each day the numbers will change so I need below to work:

    cd /order_*/item_*

    However, SSIS task fails when I use wildcards.

    Is there a way to make this work?

    Thank you

  • if your folder location varies it is a bit more complicated and you would be better off using the c# script route than using the execute process task (see https://winscp.net/eng/docs/library_ssis  and https://winscp.net/eng/docs/library#using for the classes available within it).

    always better as it avoids you having to generate a script file in order to execute the ftp.

    and are you copying files from the ftp server or send files to it?

    If sending then you will know (I presume) the numbers above so easier to do.

    if you are copying down from server - is there any part of your process that contains a list of the folders on remote server - or that contains the order numbers you need to copy down? if so use it to determine what to copy

    If not then you will need to use other ftp commands to list the remote server files/folders and then loop through them to process each folder one at the time. - again in C# this becomes easier.

  • Thanks for the reply.  I went with the Execute Process Task approach because it was way easier to setup.  I took a look at the C# one and I was getting lost by the installation process.  Is there any way to get the wild cards to work with Execute Process Task? I am only downloading files.

  • mostly no. some directory mask exist but not the way you need it https://winscp.net/eng/docs/file_mask#directory

    you kind of always need to get a listing of files/folders on remote server onto a C#object or onto a local file and then parse that list.

  • I am trying to setup the C# approach.  The installation documentation isn't clear.  How exactly do you set it up?

  • either install is to GAC (advisable but on  your server you will also need to install it in GAC) or use the assembly.resolve method - both explained on their page

    download page https://winscp.net/eng/downloads.php

    instructions on https://winscp.net/eng/docs/library_install

    and the link I gave previous has some examples of how to load using the assembly.resolve method

    in both cases extract the contents of the zip file to a folder on your software drive (likely C on your own pc, may be another one on the server). and then use the install path on both registering to GAC and within your c# code to reference the exe

  • I got C# working.  If the paths are explicitly set then no problem.  It works.  Now the next part is to use wildcards to access remote directory.  How would I implement this?  Looks like I need coding to do this but I have very limited knowledge of C#.  Any help is much appreciated.

  • can you post the C# code you did - and can you explain in detail what the process is to get the files - is it multiple folders/files or single sets. and what is the pattern for folders/files?

    and do the remote folders get removed after you copy them or if you try again next day will the previous ones still be there?

  • Thank you so much for your help!

    No problem.  Here you go.

    The files are put on the SFTP site daily (M-F) and remain there for 30 days.  After 30 days, the system automatically purges the files.

    The format is "order_number/item_number/PriceFile_YYYY_MM_DD"

    Every day the numbers (_number part) will change depending on what they are.  They are orders so I have no clue on how they determine them.  All I know is that they are numbers so "order_*/item_*" will work fine.  I know I am oversimplifying the logic but essentially this is what is happening.

    It is definitely possible that SSIS job may fail so I have to reload the files.  For this reason, I need a variable assigned called Days_Retrieval that determines how many records should be returned.  For example, if Days_Retrieval = 10 then that means it will return PriceFile that are within 10 days of today's date and including today's date.  Days_Retrieval = 0 means only return today's records.  99% of the time the value for Days_Retrieval will be 0 because I just want today's records.  However, when I buy a new dataset I have to load historical records so in this case I would set Days_Retrieval to a value large enough that all historical records are captured.  In practice, this will happen infrequently as I don't buy new datasets all the time.  I only buy for researching new ideas.  The other use-case would be re-creating the database in case of database corruption.

    #region Help:  Introduction to the script task
    /* The Script Task allows you to perform virtually any operation that can be accomplished in
    * a .Net application within the context of an Integration Services control flow.
    *
    * Expand the other regions which have "Help" prefixes for examples of specific ways to use
    * Integration Services features within this script task. */
    #endregion


    #region Namespaces
    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Runtime;
    using System.Windows.Forms;
    using WinSCP;
    #endregion

    namespace ST_8f76706671c04b07ad1667111a5bdc20
    {
    /// <summary>
    /// ScriptMain is the entry point class of the script. Do not change the name, attributes,
    /// or parent of this class.
    /// </summary>
    [Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
    #region Help: Using Integration Services variables and parameters in a script
    /* To use a variable in this script, first ensure that the variable has been added to
    * either the list contained in the ReadOnlyVariables property or the list contained in
    * the ReadWriteVariables property of this script task, according to whether or not your
    * code needs to write to the variable. To add the variable, save this script, close this instance of
    * Visual Studio, and update the ReadOnlyVariables and
    * ReadWriteVariables properties in the Script Transformation Editor window.
    * To use a parameter in this script, follow the same steps. Parameters are always read-only.
    *
    * Example of reading from a variable:
    * DateTime startTime = (DateTime) Dts.Variables["System::StartTime"].Value;
    *
    * Example of writing to a variable:
    * Dts.Variables["User::myStringVariable"].Value = "new value";
    *
    * Example of reading from a package parameter:
    * int batchId = (int) Dts.Variables["$Package::batchId"].Value;
    *
    * Example of reading from a project parameter:
    * int batchId = (int) Dts.Variables["$Project::batchId"].Value;
    *
    * Example of reading from a sensitive project parameter:
    * int batchId = (int) Dts.Variables["$Project::batchId"].GetSensitiveValue();
    * */

    #endregion

    #region Help: Firing Integration Services events from a script
    /* This script task can fire events for logging purposes.
    *
    * Example of firing an error event:
    * Dts.Events.FireError(18, "Process Values", "Bad value", "", 0);
    *
    * Example of firing an information event:
    * Dts.Events.FireInformation(3, "Process Values", "Processing has started", "", 0, ref fireAgain)
    *
    * Example of firing a warning event:
    * Dts.Events.FireWarning(14, "Process Values", "No values received for input", "", 0);
    * */
    #endregion

    #region Help: Using Integration Services connection managers in a script
    /* Some types of connection managers can be used in this script task. See the topic
    * "Working with Connection Managers Programatically" for details.
    *
    * Example of using an ADO.Net connection manager:
    * object rawConnection = Dts.Connections["Sales DB"].AcquireConnection(Dts.Transaction);
    * SqlConnection myADONETConnection = (SqlConnection)rawConnection;
    * //Use the connection in some code here, then release the connection
    * Dts.Connections["Sales DB"].ReleaseConnection(rawConnection);
    *
    * Example of using a File connection manager
    * object rawConnection = Dts.Connections["Prices.zip"].AcquireConnection(Dts.Transaction);
    * string filePath = (string)rawConnection;
    * //Use the connection in some code here, then release the connection
    * Dts.Connections["Prices.zip"].ReleaseConnection(rawConnection);
    * */
    #endregion


    /// <summary>
    /// This method is called when this script task executes in the control flow.
    /// Before returning from this method, set the value of Dts.TaskResult to indicate success or failure.
    /// To open Help, press F1.
    /// </summary>
    public void Main()
    {
    // TODO: Add your code here

    // Setup session options
    SessionOptions sessionOptions = new SessionOptions
    {
    Protocol = Protocol.Sftp,
    // To setup these variables, go to SSIS > Variables.
    // To make them accessible from the script task, in the context menu of the
    // task, choose Edit. On the Script task editor on Script page,
    // select ReadOnlyVariables, and tick the below properties.
    HostName = (string)Dts.Variables["User::HostName"].Value,
    UserName = (string)Dts.Variables["User::UserName"].Value,
    Password = (string)Dts.Variables["User::Password"].Value,
    SshHostKeyFingerprint = (string)Dts.Variables["User::Fingerprint"].Value
    };

    try
    {
    using (Session session = new Session())
    {
    // As WinSCP .NET assembly has to be stored in GAC to be used with SSIS,
    // you need to set path to WinSCP.exe explicitly,
    // if using non-default location.
    session.ExecutablePath = @"C:\Users\Me\Documents\WinSCP-5.19.5-Automation\WinSCP.exe";

    // Connect
    session.Open(sessionOptions);

    // Download files
    TransferOptions transferOptions = new TransferOptions();
    transferOptions.TransferMode = TransferMode.Binary;

    TransferOperationResult transferResult;
    transferResult =
    session.GetFiles("/order_000249383/item_000181618/PriceFile_*", @"C:\Users\Me\Documents\To_Be_Processed\", false, transferOptions);

    // Throw on any error
    transferResult.Check();

    // Print results
    bool fireAgain = false;
    foreach (TransferEventArgs transfer in transferResult.Transfers)
    {
    Dts.Events.FireInformation(0, null,
    string.Format("Download of {0} succeeded", transfer.FileName),
    null, 0, ref fireAgain);
    }
    }

    Dts.TaskResult = (int)DTSExecResult.Success;
    }
    catch (Exception e)
    {
    Dts.Events.FireError(0, null,
    string.Format("Error when using WinSCP to upload files: {0}", e),
    null, 0);

    Dts.TaskResult = (int)DTSExecResult.Failure;
    }


    Dts.TaskResult = (int)ScriptResults.Success;
    }

    #region ScriptResults declaration
    /// <summary>
    /// This enum provides a convenient shorthand within the scope of this class for setting the
    /// result of the script.
    ///
    /// This code was generated automatically.
    /// </summary>
    enum ScriptResults
    {
    Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
    Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
    };
    #endregion

    }
    }
  • One more comment re Days_Retrieval...

    The files will have a system generated timestamp.  Using this timestamp for calculating Days_Retrieval may not be helpful in all cases.  I will have to give it more thought to ensure that all use cases are passing.  One factor that complicates things if this approach is taken is timing.  There is no guaranteed delivery time so it could happen today or early tomorrow morning.  I am not a coding expert but I think this approach is more complicated.   The "easier" approach is to use PriceFile_YYYY_MM_DD.  Parse out the YYYY, MM, DD and make a date out of it.  This way there is no confusion as the date of the file.

  • Thanks for the info. I'll play around with it.

    could you advise how do you know which files have been processed - and do you keep a copy of them on the FS after you process them?

    normally such a process would be implemented with a local folder structure that would allow you to know which files you have to process, which ones are being processed, those have have been processed and those in error

    something like

    • ..\input
    • ..\processing
    • ..\errors
    • ..\archive

    also a logging table to hold filename and other info to help you decide at what point each file is.

  • try the following - just an example and you will need to play with destination filename/directory

    its only the using session block - no need to post the remaining code

    do note that file/directoryname as likely to be case sensitive on the server

    using (Session session = new Session())
    {
    // As WinSCP .NET assembly has to be stored in GAC to be used with SSIS,
    // you need to set path to WinSCP.exe explicitly,
    // if using non-default location.
    session.ExecutablePath = @"C:\Users\Me\Documents\WinSCP-5.19.5-Automation\WinSCP.exe";

    // Connect
    session.Open(sessionOptions);

    var files = session.EnumerateRemoteFiles("/", "price_*", EnumerationOptions.AllDirectories);

    // files will contain a file per enumeration
    foreach (var file in files)
    {
    // here we would do a SQL server call to check if the file has been processed previously.
    var newfile = true;
    if (newfile)
    {
    var filename = file.FullName.Replace(@"/", "_");
    var filepath = RemotePath.EscapeFileMask(file.FullName);

    Console.WriteLine(filepath);
    TransferOptions transferOptions = new TransferOptions();
    transferOptions.TransferMode = TransferMode.Binary;
    var transferResult = session.GetFiles(filepath, $@"C:\Users\Me\Documents\To_Be_Processed\{filename}", false, transferOptions);

    // Throw on any error
    transferResult.Check();
    // Print results
    bool fireAgain = false;
    foreach (TransferEventArgs transfer in transferResult.Transfers)
    {
    Dts.Events.FireInformation(0, null,
    string.Format("Download of {0} succeeded", transfer.FileName),
    null, 0, ref fireAgain);
    }

    // after file is transferred we could add it to a sql table to flag its filename (and remote folder) as having being downloaded
    }
    }

    }
  • frederico_fonseca wrote:

    Thanks for the info. I'll play around with it.

    could you advise how do you know which files have been processed - and do you keep a copy of them on the FS after you process them?

    normally such a process would be implemented with a local folder structure that would allow you to know which files you have to process, which ones are being processed, those have have been processed and those in error

    something like

    • ..\input
    • ..\processing
    • ..\errors
    • ..\archive

    also a logging table to hold filename and other info to help you decide at what point each file is.

    Thank you for the reply!.  Yes, this is exactly how I planned to structure the SSIS package.  I was going to have two folders (to_be_processed and archive) but the ones you suggested make more sense.

  • frederico_fonseca wrote:

    try the following - just an example and you will need to play with destination filename/directory

    its only the using session block - no need to post the remaining code

    do note that file/directoryname as likely to be case sensitive on the server

    using (Session session = new Session())
    {
    // As WinSCP .NET assembly has to be stored in GAC to be used with SSIS,
    // you need to set path to WinSCP.exe explicitly,
    // if using non-default location.
    session.ExecutablePath = @"C:\Users\Me\Documents\WinSCP-5.19.5-Automation\WinSCP.exe";

    // Connect
    session.Open(sessionOptions);

    var files = session.EnumerateRemoteFiles("/", "price_*", EnumerationOptions.AllDirectories);

    // files will contain a file per enumeration
    foreach (var file in files)
    {
    // here we would do a SQL server call to check if the file has been processed previously.
    var newfile = true;
    if (newfile)
    {
    var filename = file.FullName.Replace(@"/", "_");
    var filepath = RemotePath.EscapeFileMask(file.FullName);

    Console.WriteLine(filepath);
    TransferOptions transferOptions = new TransferOptions();
    transferOptions.TransferMode = TransferMode.Binary;
    var transferResult = session.GetFiles(filepath, $@"C:\Users\Me\Documents\To_Be_Processed\{filename}", false, transferOptions);

    // Throw on any error
    transferResult.Check();
    // Print results
    bool fireAgain = false;
    foreach (TransferEventArgs transfer in transferResult.Transfers)
    {
    Dts.Events.FireInformation(0, null,
    string.Format("Download of {0} succeeded", transfer.FileName),
    null, 0, ref fireAgain);
    }

    // after file is transferred we could add it to a sql table to flag its filename (and remote folder) as having being downloaded
    }
    }

    }

    Thank you so so much!!  This works!! If we met in person I would have treated you to coffee as my way of saying thanks.

    Is there a way to add a check for the Days_Retrieval?

    In your comments you made a mention of making SQL checks. I like this idea.  How would I implement it?

    Again, thank you so much!!  No words can begin to express my gratitude.

  • One more question ...

    From a design perspective... is it better to have one script do it all or should there be different SSIS tasks for each?

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply