SSIS 2012 - How to Query Excel File Using "Execute SQL Task"

  • First of all, I don't see a SSIS 2012 forum here, just this SSIS 2005 forum, so I hope it's o.k. that I'm posting in this particular forum since it's the closest to what I'm actually looking for...

    I'm developing a SSIS 2012 solution for a client. The gist of the project is that it basically runs once daily to loop through subfolders present in a particular path. If new Excel files are found to have been placed in the various subfolders since the SSIS package ran 24 hours earlier, then the process runs, loops through the various Excel files which are all different structurally, and does ETL on all of the files and places the transformed data into individual Excel spreadsheets (via the use of an Excel template file) to where data from each Excel file now follows the same Excel file structure.

    There more to the project than what I stated above, but that's the main component of the project.

    My client has requested an email to be sent along with inclusion of a few columns of data from each originating Excel file based on the following criteria. Two of the fields commonly provided in each Excel file are "Hire Date" and "Rehire Date". I'm needing to create what I assume would be an Execute SQL Task as each original Excel file is looped through, and if there's a "Rehire Date" present that's greater than the "Hire Date" in each row, the client would like to receive an email with just a few columns of information (Person's Name, Social Security#, Hire Date, & Rehire Date) to sort of serve as a flag for them to call the provider of the specific Excel files where the above criteria has found to exist to verify with them that the "Hire Date" and "Rehire Date" data was provided correctly.

    The situation is complicated by the fact that the Worksheet Names for each of the incoming Excel files can be different file to file. I've established some coding that dynamically determines what each Excel file's worksheet names are, so I'm hopeful I can somehow use that variable in constructing the SQL statement something like "Select * from [User::vSheetName] where Rehire Date > Hire Date". Is that possible?

    I'd also need to perform a count of the above query to where if there are 0 results returned, the sending of the desired email doesn't occur. The resulting couple of columns from the query in the email don't of course need to be sent until the situation of "Rehire Date > Hire Date" is actually found to produce at least one record via the query.

    Lastly, due again to the fact that all of the incoming Excel files are different in many cases, many of the Excel files don't even have the "Rehire Date" field provided, and if that's the case, I'd need to somehow determine that this particular field isn't even present in the Excel file and therefore this Execute SQL Task doesn't even need to be processed since the "Rehire Date" field isn't even a part of the original Excel file.

    I've found some decent examples online that deal with writing Execute SQL Tasks to query against an underlying SQL Server database, but I'm not finding many examples that delve into any special nuances that exist when you're trying to query from an Excel spreadsheet via an Execute SQL Task.

    Many thanks for any clarification / examples / links to good examples anyone can provide here.

  • I´m not certain that you are on the right track here with the Execute SQL Task, it has no means of accessing the Excel spreadsheet other than OPENROWSET or such. My take on this would be to write whats needed to a table in a database and then run a reporting/email job from there.

    😎

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply