There must be an easier way of doing this...

  • ...I just don't see it yet.

    For the last 3 years I've been focused exclusively on SQL. I spent most of this time working in T-SQL, and some in P-SQL, but I did not jump into SSIS or SSRS during that time. I was primarily an analyst, and this numbed my memory to how much I relied on VB6 when I used to do full development in the past.

    My new position has shown that I will need to get back into a more complete development mind set, and I was caught off guard by this realization. I dusted off my old VB skills with some work, and came up with the attached code (modified only to remove client specific data). This was completed under the gun, and I will be the first to say it is far from the best I could do. In the end it works, but there must be a better way to do it.

    The project sounded relatively simple. The client would load 3 to 4 files every 1.5 hours to our SFTP server. These data files would show authorized transactions (AD), checks actually printed (CI), payments received (PM), and a list of voided checks (CV). Since each of these files have a unique name (including both date and time sections) I cannot just kick off a saved Import Tool SSIS package. These files need to be up and ready to go by 8:00 AM every day, and even loading them while I'm at the house before I come in they would not be ready until around 9:00 AM.

    The solution was to create a command line program which can be launched from a scheduler program. The program copies everything from the FTP folder that is not in the Archive folder into a Staging folder. Once in the staging folder the program edits each file (one by one) to remove the final line (a check sum row), and adds the SourceFileName and BatchDate fields (values found in data file's name). It then looks at the last 2 characters of the file name before the .csv to determine what kind of file this is (AD, CI, CV, or PM), and directs the flow control to the correct sub function. In the sub functions there are a couple of the file types that must account for multiple format possibilities (the client changed it on us a couple of times).

    I ran into an issue that took me some time to identify and find a solution for. Originally I did not have a 5 second pause after after calling the DTSX package. Without the pause the program moves on almost immediately, and tries to delete the file in the staging folder. If the DTSX package is still loading the file then errors begin to occur.

    I also ran into an issue while looking up how to add the timer (I couldn't use the standard Visual Studio timer object since this is a command line program...I had nothing to attach the object to - at least that I knew of). There is a VB tool called Sleep() which is found in a kernel32 library. The problem is that I am on a 64-bit system, and it seems that the Sleep() function is based out of an old x86 32-bit architecture.

    Now that all of the back story is out, here are my questions. Does SSIS have a way of accounting for multiple file names? Does SSIS allow for a file rename step like what I've done here? Is there a better way of calling the DTSX package then using the Shell(Call string) option? Is there a way to avoid the timer option, and just detect a return value from the DTSX package (with this return value I should also be able to kick off an error control step if the package failed)? Finally, was there a simpler way of doing this? If yes, then what is it?

    Visual Studio 2010 Professional

    SQL Server 2008 R2

    [font="Arial"]β€œAny fool can know. The point is to understand.”
    - Albert Einstein

    "DOH!"
    - Homer Simpson[/font]

  • jarid.lawson (2/15/2012)


    Now that all of the back story is out, here are my questions. Does SSIS have a way of accounting for multiple file names?

    The ForEach loop can handle looping through a bunch of files in a target with wildcards in the names. It loads the name (and/or full path) to a local package variable you can throw around. You can attach your datasources to it via expressions.

    Does SSIS allow for a file rename step like what I've done here?

    It does, but I usually prefer to use a Script component and call the filesystem object from the VB script. However, you can use a bunch of your locally detected variables for it.

    Is there a better way of calling the DTSX package then using the Shell(Call string) option?

    I personally usually use SQLAgent.

    Is there a way to avoid the timer option, and just detect a return value from the DTSX package (with this return value I should also be able to kick off an error control step if the package failed)?

    Not so easily. You can use it as a sub-package and have it return items to the master package, but I'm not familiar with returning values externally to another software. There is pretty heavy duty error controls inside of SSIS however.

    Finally, was there a simpler way of doing this? If yes, then what is it?

    I'm not sure what questions to ask yet to answer that question. I'd have to go through it a few times and try to clarify the pain points (in particular, modifying the flat files before being able to really use them) before I could answer that effectively.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Gosh... I just know a bunch of people will come out of the woodwork on my response but I use simple DOS commands via xp_CmdShell and BULK INSERT for this type of stuff. About the only thing I might do using something else is to use VBS to strip off that last line but even that takes time and have found that doing it all in T-SQL/xp_CmdShell does a fine and fast job.

    A more modern alternative would be to use PowerShell for most of this. Ironically, if I used PowerShell for this, I'd call it viz xp_CmdShell. πŸ™‚

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply