Batch ETL of Multiple Data Files Using an SSIS Foreach Loop Container

  • imz (6/13/2013)


    RonKyle (6/13/2013)


    You can actually simplify it even more if you like, by using a File System Task, instead of a Script Task to move your processed files once done. That way you don't even need to write any code!

    Having just spent the past several days developing imports for multiple csv and xls files, I would recommend that a script task would be best to do the move. The move file task seems to be buggy, at least with variables, and the suggested work around of using a copy followed by a delete affects file properties such as creation date.

    I have used the File task with variable and never encountered a problem so I'm curious to know what issues have you had and in what scenario(s). Do you have some examples you could share?

    Thanks in advance

    File system task with a variable works just fine for me, not buggy at all.

  • Wonderful.

    Share a website with you ,

    ( http://www.femalemalls.com/ )

    Believe you will love it.

    We accept any form of payment.

    http://www.femalemalls.com/Michael-Kors-Handbags-n2447/

  • Would you be able to send a snapshot of the appropriate tabs on your file task system. I would be very interested to see it.

    Thanks,

  • Here are some screenshots of how I use the File Task.

    First I have ForEach loop that will scan a folder looking for files matching a prefix. Here I'm using variables for folder path, file mask (ForEach1.jpg) and to capture the names of each file found in another variable (ForaEach2.jpg). Inside the ForEach loop I have the File Task that will move a file using the "Move File" operation (FileTask1.jpg). You could also use the "Rename File" to move a file if you need to rename the file as you move it (e.g. add a time stamp to the file name).

  • I'm really puzzled by this as that's exactly how I had it set up. The first two steps are still that way since w/o that the code wouldn't work as well. It's very odd that the same variables that don't execute using the File Task work fine as code. The file is getting renamed, so maybe I will try it again using the rename even though the move function works in code. I'll post the results one way or the other.

  • The "Move File" operation will not allow you to change the name so the destination can only contain a path to the file folder without the file name. If you want to move it and give it a different name at the same time then you need to use the "Rename File" operation. It was confusing to me at first and I still don't understand why they chose to give this functionality to the "Rename File" operation instead of giving it to the "Move File" one or even create another called something like "Move and Rename File".

  • Doing it as a rename worked. I'm all for using code, but if a non-code solution will do, I'll use that instead. Thanks,

  • I've used both Script Tasks and File System Components to perform file manipulation but try to use native SSIS components to perform an operation when available meaning I tend towards the File System Task unless there's specific reason not to.

    Because the Script tasks are basically just wrappers for C#/VB .net code you could effectively write a whole application to perform any operation all the way down to a full GUI if desired, but that's not really what it was intended for. The script task is very powerful but it obfuscates the tasks that it performs by removing the logic from SSIS and placing it in a different environment. It's more a personal preference than anything but I just like being able to follow the logic in a package without having to open up a script task, open the script in a separate Visual Studio code window, look at which DLLs were loaded to perform specialized operations, try to figure out which variables are passed back and forth etc.

    Regarding the loading of multiple files of the same format one thing that has been very handy in certain circumstances is the Multiple Flat File connection manager. You can use the same kind of wildcard matching that you'd use in a for each file loop and treat any number of like-formatted txt files as a single file. It's especially handy if your data flows include lookups or other operations that have to occur every time the data flow is loaded since they will only occur once as opposed to every iteration of the outside loop. Not a solution in all cases since you wouldn't be able to do file-by-file archiving as easily but useful nonetheless.

    Kris

  • I tried the code that was suggested in the visual basic 2008 script and get an error for the following statement:

    Dim input_file As New FileInfo(Dts.Variables("FilePath").Value.ToString)

    The error is "Type 'FileInfo' is not defined."

    Do you have an answer for that?

  • Add "Import IO.System" to the import section of the script task.

  • To be able to use FileInfo, you have to import library reference Microsoft.SqlServer.SmoExtended if you are using SQL2008 or SQL2008R2 and then use it in your Script Main window by Using Microsoft.SqlServer.Management.Smo;. You might want to bring Microsoft.SqlServer.Management.Collector and Microsoft.SqlServer.Management.Common as well.

  • When I try adding Imports IO.System to the visual basic 2008 script contained in this article, I get the following error:

    Namespace or type specified in the Imports 'IO.System' doesn't contain any public member or cannot be found. Make sure the namespace or the type is defined and contains at least one public member. Make sure the imported element name doesn't use any aliases.

    Has anyone actually tried to run the script by Stan Kulp "Batch ETL of Multiple Data Files Using an SSIS Foreach Loop Container" ?

  • When I try adding Imports Microsoft.SqlServer.SmoExtended to the visual basic 2008 script contained in this article, I get the following error:

    Namespace or type specified in the Imports 'IO.System' doesn't contain any public member or cannot be found. Make sure the namespace or the type is defined and contains at least one public member. Make sure the imported element name doesn't use any aliases.

    Any other ideas?

  • I just ran it and it worked perfectly. I wrote it in 2008, but just made it work in 2012.

    What version of SSIS are you using? Did you use the BatchETL.dtsx file in the BatchETL.zip resources file?

  • I'm using version 2008 R2. In my SSIS package I have a for each loop container that reads multiple xml files and inserts data into sql tables. That works fine. The last step in the for each loop is a Script Task for archiving the XML files to another folder. So, I copied exactly the code in your example. The only change I made was the connection for the archive folder which is the Dim DestinationDirectory. One other change was instead of:

    Dim input_file As New FileInfo(Dts.Variable("FilePath").Value.ToString)

    I used:

    Dim input_file As New IO.FileInfo(Dts.Variable("FilePath").Value.ToString)

    Without making that change I was getting the error: "Type InfoFile is not defined"

    Now I'm getting this error during execution:

    Error: System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.ArgumentException: The path is not of a legal form.

    at System.IO.Path.NormalizePathFast(String path, Boolean fullCheck)

    at System.IO.FileInfo..ctor(String fileName)

    at ST_b1430fd9112f48ef96919934e7029f42.vbproj.ScriptMain.Main()

    --- End of inner exception stack trace ---

    at System.RuntimeMethodHandle._InvokeMethodFast(Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes methodAttributes, RuntimeTypeHandle typeOwner)

    at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks)

    at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)

    at System.RuntimeType.InvokeMember(String name, BindingFlags bindingFlags, Binder binder, Object target, Object[] providedArgs, ParameterModifier[] modifiers, CultureInfo culture, String[] namedParams)

    at Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTATaskScriptingEngine.ExecuteScript()

    Here is my script:

    Imports System

    Imports System.Data

    Imports System.Math

    Imports Microsoft.SqlServer.Dts.Runtime

    <System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _

    <System.CLSCompliantAttribute(False)> _

    Partial Public Class ScriptMain

    Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

    Enum ScriptResults

    Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success

    Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure

    End Enum

    Public Sub Main()

    Dim DestinationDirectory As String = "T:\Developers\Ron.Dore\CyberCom\CyberComXMLfilesArchive\"

    Dim file_name As String() = Split(Dts.Variables("FilePath").Value.ToString, "\")

    Dim input_file As New IO.FileInfo(Dts.Variables("FilePath").Value.ToString)

    Try

    input_file.MoveTo(DestinationDirectory & file_name(2))

    Catch ex As Exception

    End Try

    End Sub

    End Class

    Thanks for your help.

Viewing 15 posts - 16 through 30 (of 30 total)

You must be logged in to reply to this topic. Login to reply