Looping through multiple excel sheets and deleting the blank sheets

  • Hi All,

    Iam working on a ssis 2005 package wherein i use a dataflow task to loop through all the sheets of an excel and save them in datareader. the dataflow task consists of a script component to get the sheet names and a datareader destination to store the values in a table.

    Next, I use a foreach loop container to loop through all the sheets in the datareader table.

    Inside the Foreach container I have to check if the sheet is empty or not. If empty, i have to delete the sheet or else i have to proceed further to store it in the database.

    Can anyone please help me in completing this ?

    With Regards,
    Anu..;-):hehe:

  • ok...my initial thoughts...

    not that straight forward as the for..next container doesn't have the functionality to go through worksheets

    so, i propose

    1) a script task that opens the workbook, loops through the pages, deletes the blanks and dumps the worksheet name into a table in your database with a simple bit of SQL

    2) create a WorksheetName package string variable

    3) in a for..next container, loop through the table, populate the variable

    4) use the variable as your sheetname in your data source

    oh and clear the table first

    oh and you may need to stick a "$" on your worksheet name

  • in fact, it was puzzling so much, i had a crack at the script task

    ' Microsoft SQL Server Integration Services Script Task

    ' Write scripts using Microsoft Visual Basic

    ' The ScriptMain class is the entry point of the Script Task.

    Imports System

    Imports System.Data

    Imports System.Math

    Imports Microsoft.SqlServer.Dts.Runtime

    Imports Microsoft.Office.Interop.Excel

    Imports System.Data.SqlClient

    Public Class ScriptMain

    Private _app As Microsoft.Office.Interop.Excel.Application

    Private _books As Microsoft.Office.Interop.Excel.Workbooks

    Private _book As Microsoft.Office.Interop.Excel.Workbook

    Protected _sheets As Microsoft.Office.Interop.Excel.Sheets

    Protected _sheet As Microsoft.Office.Interop.Excel.Worksheet

    Public Sub Main()

    '

    Const WorkbookPath As String = "U:\doodle\DummyTables.xls"

    Dim i As Integer

    'Try

    OpenExcelWorkbook(WorkbookPath)

    For Each _sheet In _sheets

    'MsgBox(_sheet.Name)

    AddTable(_sheet.Name)

    Next

    Dts.TaskResult = Dts.Results.Success

    'Catch

    'Dts.TaskResult = Dts.Results.Failure

    'End Try

    End Sub

    Protected Sub OpenExcelWorkbook(ByVal fileName As String)

    _app = New Microsoft.Office.Interop.Excel.Application()

    If _book Is Nothing Then

    _books = _app.Workbooks

    _book = _books.Open(fileName, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing)

    _sheets = _book.Worksheets

    End If

    End Sub

    Protected Sub AddTable(ByVal Worksheet As String)

    Dim con As New SqlConnection("Server=yourserver; database=yourDB;Trusted_Connection=Yes ")

    Dim cmd As New SqlCommand("insert into zz_Worksheet (Worksheet) values ('" & Worksheet & "')", con)

    con.open()

    cmd.ExecuteNonQuery()

    con.close()

    End Sub

    End Class

    was a bit of a mission getting Microsoft.Office.Interop.Excel from the web into the GAC...should be standard really

  • Thanks Darth....I would definetely try it out and let you know.

    I have followed an example in the below link for looping through the excel sheets.

    http://biresort.net/blogs/pedrocgd/archive/2009/08/06/stepbystep-ssis-extract-data-from-multiple-excel-worksheets.aspx

    Could you please check it out and let me know where in i have to modify the code in script component to achieve my requirement.

    I have made a small change in the code but it is not giving the desired result , please check it and let me know.

    For Each tableInFile In tablesInFile.Rows

    Me.Output0Buffer.AddRow()

    Me.Output0Buffer.colSheet = tableInFile.Item("TABLE_NAME").ToString().Trim()

    If tableInFile Is Nothing Then

    tableInFile.Delete()

    End If

    excelTables(intTableIndex) = currentTable

    intTableIndex += 1

    Next

    Thanks for bearing with me....:-)

    With Regards,
    Anu..;-):hehe:

  • hi Darth....Iam not able to add Microsoft.Office.Interop.Excel in my script.

    Here is what i have followed.

    1. I closed the package and opened the code view by rt clicking on the package and selecting view code

    2. In the XML file I have added a referencePath :

    <Settings

    DefaultNamespace = "ScriptTask_6f1760699c434992b370d7210759151d"

    OptionCompare = "0"

    OptionExplicit = "1"

    OptionStrict = "1"

    ProjectName = "ScriptTask_6f1760699c434992b370d7210759151d"

    ReferencePath = "C:\WINDOWS\assembly\GAC_MSIL\Microsoft.SqlServer.ScriptTask\9.0.242.0__89845dcd8080cc91\;C:\WINDOWS\assembly\GAC_MSIL\Microsoft.SqlServer.ManagedDTS\9.0.242.0__89845dcd8080cc91\;C:\Windows\assembly\GAC\Microsoft.Office.Interop.Excel\11.0.0.0__71e9bce111e9429c\"

    3. Now in References I added a reference :

    <Reference

    Name = "Microsoft.Office.Interop.Excel"

    AssemblyName = "Microsoft.Office.Interop.Excel"

    />

    4. Now I added this reference in imports block my script task and i can also see this in my references.

    the problem is when i try to write any code using "Microsoft.Office.Interop.Excel" Iam getting error as it is not defined.

    what am I missing in this part?

    With Regards,
    Anu..;-):hehe:

  • Try adding the reference through the UI?

    Steve.

  • Steve ...

    U mean the package script task ??? how do I do it?

    This package will run as a job scheduled to run every day....

    With Regards,
    Anu..;-):hehe:

  • erm...not sure why you are tinkering in xml

    to add a DLL to the references, you need to

    a) find it

    http://social.msdn.microsoft.com/Forums/en/netfxsetup/thread/c9e83756-4ae2-4ed4-b154-1537f3bb3a22

    not sure which version of windows you have, so will leave it to you to find

    b) download it

    c) stick it in your GAC C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727

    d) stick it in your assembly

    C:\WINDOWS\assembly

    e) add it to the references in your script task

    bit of a ball ache i know, welcome to dot net!

    if it's any consolation, i have tried it on my PC and it works fine 🙂

  • thanks darth..

    it did took a lot of time for me to implement this but finally worked for me....

    With Regards,
    Anu..;-):hehe:

  • Dear Darth,

    Need your help again !!!!!!

    Let me first explain you the scenario.......

    I need to loop through a folder and import the excel files with the filename CustomerMasterDump.xls into the sql database.

    The CustomerMasterDump.xls contains 3 sheets (CustomerMaster,CustomerHierarchy,SalesHierarchy).

    Depending on the sheet name i need to upload data into different tables.

    i.e. if the sheetname is customermaster i need to dump the data into customermasterdump table and if it is customerhierarchy i need to dump the data into customerhierarchydump table.

    Also the number of columns and the column heading and column values changes depending on the sheet name.

    Is it possible to achive all these in only one package or do i have to create three different packages for all the three sheets and then in another package call the package depending on the sheetname.

    Please help me as im struck up here badly..............

    With Regards,
    Anu..;-):hehe:

  • sounds easy enough to do in vb

    you just need to paraterise the routine, get a for...each loop to go through the folder and then call the routine??

  • you mean to say that it is possible to do in a single package with just vb code in it ???

    Can u give me a clue on how to dynamically call the columns of the sheet ?

    Which is better...creating 3 packages and calling them in another package or creating a single package with different dataflows ..?

    With Regards,
    Anu..;-):hehe:

  • are the column names changing within the same sheets?

    where are all these crappy spreadsheets coming from 🙂

  • the column names are different for all the three sheets....user will be uploading these files from frontend(aspx page). I have integrated the package with asp.net

    With Regards,
    Anu..;-):hehe:

  • yeah but are the column names changing day on day for the same worksheet?

    and what is asp talking to? just go there and cut out the middle man?

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply