Control Flow: selectively consuming multiple source files

  • I've got to find a way for my foreach loop container to recognize the difference between

    file1_2014_06_20

    file2_2014_06_20

    file3_2014_06_20

    and

    file1_2014_06_21

    file2_2014_06_21

    file3_2014_06_21

    then import just the three files meeting date criteria.

    I tried to accomplish this by setting up 3 package variables, all set to evaluate as expressions:

    Name: namePart/Datatype:String/Value: substring(@[user::zippedFile], 1, 5)

    Name: datePart/Datatype:String/Value: substring(user::zippedFile], 6, 11)

    Name: zippedFile/Datatype:String/Value: @[user::namePart]+@[user::datePart]

    I use the zippedFile variable in the Foreach Loop Container, with index of 0, to 'become' the name of the whatever file it encountered.

    In foreach loop container collection tab I specify File enumeration, the Folder path as C:\Users\me\Downloads\MarinZipped, files as *txt.zip. In variable mappings I set user::zippedFile with Index of 0.

    Inside the Foreach Loop Container is an execute process task that unzips the source files and puts them in a drop folder.

    When I run this, against folder with 6 files and different dates, it unzips and moves only the first 3 files. The 4th file it deems as duplicate and black box comes up asking if I want to overwrite, even though it has a different date.

    I hoped having a 'datePart' variable would allow for the zippedFile date part to be evaluated at run time, but it appears not to.

    Please suggestions?

  • Where exactly do you use the datePart variable in your for each loop container configuration?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • while you were replying I was editing my entry. Important missing part that I added was :

    Name: zippedFile/Datatype:String/Value: @[user::namePart]+@[user::datePart]

  • If you put zippedFile in the variable mappings, the for each loop will overwrite that variable each iteration of the loop with the current filename.

    If you want the for each loop to only loop over files ending with 2014_06_20 for example, you need to configure the loop with the wildcard *2014_06_20.txt.zip.

    You can do this through an expression.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • no that won't work for me, because I'm trying to work toward a solution where ultimately I can supply the datePart dynamically.

    And what I mean by dynamically, is that I plan to build an execute sql task (upstream) that evaluates the datepart as being <=getdate() but > than the datecolumn in a table which has a processed flag set to Y or N.

    I am interested in consuming file1-3 for all dates, but I must be able to control the order dynamically....

    Can you alter suggestion within this context (see this post if you have time 🙁http://www.sqlservercentral.com/Forums/Topic1592057-364-1.aspx)?

  • I would get all the dates that need to be processed and put them in an object variable.

    In a for each loop, I would loop over all of the dates stored in the object variable.

    Inside this for each loop, there is another for each loop where you use the date in the wildcard filter. In the inner for each loop, you deal with the files.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • three foreach loops, one nested inside the other.

    still worried about how to pass the datepart as part of the filename, since the above didn't work....

    is it correct that filename will be built up using an expression using two variables: filepart and datepart?

  • I'm getting this error:

    Warning: The For Each File enumerator is empty. The For Each File enumerator did not find any files that matched the file pattern, or the specified directory was empty.

    I've got a Execute Sql Task followed by a Foreach Loop container, inside of which is an Execute Process Task.

    Goal: use the Execute Sql Task to get the timestamp of the file I intend to process (eg. 2014-06-20), pass that time stamp to the Foreach Loop Container and use it inside the Foreach File Enumerator expression for the property FileSpec ( "*"+@[User::datePart]+".txt.zip"), and pass this to the File Process Task, to unzip just those files that have the 2014-16-20 timestamp.

    SourceFolder looks like this:

    File1_2014-06-20.txt.zip

    File1_2014-06-21.txt.zip

    File2_2014-06-20.txt.zip

    File2_2014-06-21.txt.zip

    File3_2014-06-20.txt.zip

    File3_2014-06-20.txt.zip

    I created a TimeTbl to contain all dates, with column as to whether that date was processed or not. DDL:

    CREATE TABLE [dbo].[Timetbl](

    [TimeCol] [datetime] NULL,

    [IsProcessed] [bit] NULL

    ) ON [PRIMARY]

    GO

    inserted into table, 4 rows

    insert into Timetbl

    values

    ('2014-06-18', 1),

    ('2014-06-19', 1),

    ('2014-06-20', 0),

    ('2014-06-21', 0);

    PACKAGE

    2 package level variables:

    name:datePart\scope: Package1\datatype: string

    name: zippedFile\scope: Package1\datatype: string

    Control Flow:

    Execute SQL Task: make a Datepart to use in Foreach Loop containers FileSpec expression

    General

    Result Set: single row

    connectionType: oledb

    sqlsourcetype: direct input

    sqlstatement:

    select top 1 CONVERT(varchar(10), TimeCol, 20) as TimeCol

    from Timetbl

    where IsProcessed = 0

    order by timecol asc

    Parameter Mapping

    [nothing]

    Result Set

    Result Name: TimeCol

    Variable Name: User::datePart

    Expressions

    [nothing]

    Foreach Loop Container

    Collection

    Enumerator: Foreach File Enumerator

    Expressions: Expressions: FileSpec\"*"+@[User::datePart]+".txt.zip"

    Folder: C:\SourceFolder

    Files: *.txt.zip

    VariableMappings:

    variable: User::zippedFile\index:0

    Execute Process Task Editor

    Process

    Executable C:\Program Files (x86)\7-Zip\7z.exe

    Arguments e -oC:\Users\me\Downloads\test2

    Expressions

    Arguments\"e " +@[User::zippedFile]+ " " +"-oC:\\Users\\me\\Downloads\\test2"

  • Shouldn't the variable that holds date values be of 'Object' type than 'String'?

  • Put a breakpoint on the for each loop and before it starts iterating check if all the values of the variables and the expressions are what you expect.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • I don't think the datePart variable should be set to Object.

    Per simple talk article

    https://www.simple-talk.com/sql/ssis/ssis-basics-using-the-execute-sql-task-to-generate-result-sets/

    "Using the Execute SQL task to return a full result set is similar to returning a single-row result set. The primary differences are that your target variable must be configured with the Object data type, and the task’s ResultSet property must be set to Full result set. "

    ...besides, I gave a whirl and the package failed on the execute process task, whereas when I use nvarchar the remainder of the tasks complete...and error with :

    "Warning: The For Each File enumerator is empty. The For Each File enumerator did not find any files that matched the file pattern, or the specified directory was empty."

    I didn't find a way for checkpoint to display what it's passing forward, for the datePart...so I have no idea. Seems like it's coming down to the variable, because I've declared it to be equal to 2014-06-20 vis a vis the query, which matches the datepart in the name.

    Puh luz, more ideas?

  • KoldCoffee (7/17/2014)


    ...besides, I gave a whirl and the package failed on the execute process task, whereas when I use nvarchar the remainder of the tasks complete...and error with :

    "Warning: The For Each File enumerator is empty. The For Each File enumerator did not find any files that matched the file pattern, or the specified directory was empty."

    That is a warning, not an error. It just says the for each loop didn't find a file, so it didn't loop.

    You can put a breakpoint on a task by right-clicking on it and selecting Edit Breakpoints. Put it on the PreExecute event.

    When the package stops, take a look at the Locals window.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • ok, I don't see the value of the breakpoints. I set it as advised, and the breakpoints window is only telling me that it ran: name, hit count, file, address, data, process. I do not see a locals window.

    are you saying that the above configuration is pretty close to what should work? because if so, maybe the only issue is datatype...

    like I said, breakpoints aren't given me additional information. The datepart variable obviously isn't evaluating to the date, therefore *+user::datepart goes unheeded.

  • could someone take the ddl i provided and details of set up and try to reproduce and find the location of problem? I'm stumped.

  • Koen Verbeeck (7/16/2014)


    Put a breakpoint on the for each loop and before it starts iterating check if all the values of the variables and the expressions are what you expect.

    Koen, I put a breakpoint on the task that executes a sql task to get a date from the TimeTbl, and it returned 2014-06-20 00:00:00 instead of 2014-06-20. (I finally was able to see how breakpoints work).

    have updated the execute sql task to select top 1 convert(varchar(10), timecol,120) as timecol

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply