Importing Excel data to a table using SSIS

  • Hi all

    I am new to BI and I need some help. I have seen similar links to my question but they don't seem to work so please provide a fresh response 🙂

    I have this Excel workbook that contains numerous worksheets. I am required to loop through each worksheet and import the contents of each worksheet into a separate table in a database. So to clarify,

    Workbook 1, Worksheet 1 IMPORTS TO Database 1, Table 1

    Workbook 1, Worksheet 2 IMPORTS TO Database 1, Table 2

    Workbook 1, Worksheet 3 IMPORTS TO Database 1, Table 3

    Workbook 1, Worksheet 4 IMPORTS TO Database 1, Table 4

    I am to do this using SSIS. As I am new I am still learning the controls, and their configurations, so I need a little help to get me started.

    Thanks in advance.

  • From memory an excel connector will link to a sheet (unless you wish to programatically loop through them)

    If Sheet1 goes to TableA, Sheet2 TableB etc... then I'd be inclined to set up four connectors. It makes documentation of the SSIS neater to as you can see at a glance from where from and where to.

    HTH.

    ---------------------------------------------
    If data can screw you, it will; never assume the data are correct.

  • drew.. (5/8/2011)


    Hi all

    I am new to BI and I need some help. I have seen similar links to my question but they don't seem to work so please provide a fresh response 🙂

    I have this Excel workbook that contains numerous worksheets. I am required to loop through each worksheet and import the contents of each worksheet into a separate table in a database. So to clarify,

    Workbook 1, Worksheet 1 IMPORTS TO Database 1, Table 1

    Workbook 1, Worksheet 2 IMPORTS TO Database 1, Table 2

    Workbook 1, Worksheet 3 IMPORTS TO Database 1, Table 3

    Workbook 1, Worksheet 4 IMPORTS TO Database 1, Table 4

    I am to do this using SSIS. As I am new I am still learning the controls, and their configurations, so I need a little help to get me started.

    Thanks in advance.

    Are the number of worksheets in your workbook fixed? i.e. will there always be n worksheets in the workbook or this may vary? How many worksheets will there be in the workbook? Also, I'm assuming that worksheet 1 will always be exported to table 1, worksheet 2 to table 2 etc... ?

    Regards
    Brian Ellul
    ----------------------------------------------------------------------------------
    Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.
    - Albert Einstein -

  • Thanks for your replies KevinGurney and brian118

    No, the Excel workbook will not have the same number of worksheets each time. The number of worksheets mary vary.

    Therefore, I cannot just set up n connections mapping each worksheet to its own table.

    One week the workbook may contain 6 worksheets, the following week 10 worksheets, the week after 3 worksheets. So you see I really need a method of looping through the workbook and placing each worksheet's content in a separate table. I would like the package I create to be intelligent enough to loop through the workbook.

    Any suggestions please??

  • Hi Drew,

    I've managed to build a small SSIS package that can iterate through the sheets of a spreadsheet, the unfortunate part is that it is not dynamic in terms of working with any number of worksheets... I'm still working on that but i can safely say that it's possible to iterate throught the worksheets of a spreadsheet and insert the data onto a single or multiple tables.

    Would you be ok with that, partial, solution at this point?

    Regards,

    Aquila Hanise

  • Hi Aquila

    Thanks for your reply.

    If I saw the code I would know whether is it what I'm looking for or if I could work with it.

    My requirements would be to loop through each worksheet (in a single Excel workbook) and create a new table (all tables to be in a single database) containing the contents of each worksheet (see below):

    Workbook1 AppleWorksheet IMPORTS TO Database1 AppleTable

    Workbook1 PearWorksheet IMPORTS TO Database1 PearTable

    Workbook1 BananaWorksheet IMPORTS TO Database1 BananaTable

    Workbook1 OrangeWorksheet IMPORTS TO Database1 OrangeTable

    One week it may be the above, the following week the BananaWorksheet may not be present and therefore no BananaTable is required to be created for that week.

    As mentioned before it will work if I just did individual connections, so all I want to do is get a loop to loop through each worksheet and create a table based on the looped worksheet????

    Thank you again.

  • Alright,

    Firstly, I would like us to move from the same premise. In my solution, I assumed the following:

    - The tables (destination tables) have already been created or exist, TableName1, TableName2,...etc.

    - The sheet names are standards, e.g. Sheet 1, Sheet 2, ...etc.

    - That you are able to set up the connection manager for your Excel File and destination Database.

    In order to have the ability to iterate through the tables (destination tables) and excel sheets (data sources) when loading, one needs to use variables in order to control the changes to the table and sheets of xlsx file. I therefore created the following variables:

    - g_table_name (string: stores table names): default value - "table_name_1"

    - g_sheet_number (int: stores current sheet number to load): default value - 0

    - g_sheet_name (string: stores current sheet name to load): default value - "Issued - 1$"

    Steps to creating the solution:

    1: Drag the "For Loop Container - Control Flow Item" onto the design surface - [This provides means for iterating through your sheets and tables].

    1.1 Confirgure the properties as follows [Double Click the Object to get to Properties]:

  • Configure the InitExpression: @g_sheet_number = 1
  • Configure the EvalExpression: @g_sheet_number < 5
  • Configure the AssignExpression: @g_sheet_number = @g_sheet_number + 1
  • 1.2 Click Ok

    2: Dump a "Script Task - Control Flow Item" inside the "For Loop Container". [This provides the means to manipulate your variables]

    2.1 Configure it's properties as follows [Double Click the Object to get to Properties]:

  • ReadOnlyVariables: User::g_sheet_number
  • ReadWriteVariables: User::g_sheet_name,User::g_table_name
  • 2.2 Edit The Script [while in the Properties Page, Click "Edit Script", find the Main() Method and Overwrite with the code below]:

    public void Main()

    {

    // TODO: Add your code here

    Dts.Variables["g_sheet_name"].Value = "";

    Dts.Variables["g_table_name"].Value = "";

    Dts.Variables["g_sheet_name"].Value = "Issued - " + Dts.Variables["g_sheet_number"].Value + "$";

    Dts.Variables["g_table_name"].Value = "table_name_" + Dts.Variables["g_sheet_number"].Value;

    Dts.TaskResult = (int)ScriptResults.Success;

    }

    2.3 Click Save & Close

    3. Dump the Data Flow Task inside the "For Loop Container"

    3.1 Inside the Data Flow Task, drag and drop the "Excel Source - Data Flow Source" and Set the properties of the Excel Source. (PLEASE NOTE: YOU MAY HAVE TO SET UP THE SOURCE CONNECTION MANAGER FOR YOUR EXCEL FILE, this is not covered in this post)

  • Data Access Mode: Table Name or View Name Variable (This allows one to choose from the Variable Name from the drop down list box the variable containing the table names). Select g_sheet_name from the variable list.
  • 3.2 Inside the Data Flow Task, drag and drop the "Ole DB Destination - Data Flow Source" and Set the properties of the Ole DB Destination. (PLEASE NOTE: YOU MAY HAVE TO SET UP THE Destination CONNECTION MANAGER FOR YOUR DATABASE, this is not covered in this post).

  • Data Access Mode [Connection Manager tab in properties]: Table Name or View Name Variable (This allows one to choose from the Variable Name from the drop down list box the variable containing the table names). Select g_table_name from the variable list.
  • Map the columns from the Input Column and Destination Column [Mappings tab in properties]
  • 3.3 Click Ok

    4. Save the SSIS Package and Test

    That's it. I'll see if I can't make this more dynamic by adding the part where it can calculate retreive number of sheets and load according to that. For now, this is as far as I got.

    Regards,

    Aquila

  • Nope. Only the database has been created. As I do not know how many worksheets there will be when the SSIS package is run each week I would like to create the tables on the fly.

    For example, the package runs >> Some sort of loop is executed that loops through each WORKSHEET present in the WORKBOOK >> as each loop is executed (the number of loops depends on the number of WORKSHEETS) a new table is created in the, already exisiting, database to contain the WORKSHEETS data.

    Can this be done???

  • Hi all

    Thank you for your replies, and a big THANKS to 'Mr Quilz' for the detailed response.

    I used some methods of that reply but mostly used a different method as I ended up with a different approach.

    Thanks.

  • well, thanks for sharing your answer. Not.

    Shame on you after so many folk tried to help you. This is a community after all.

  • Yes, please share your findings drew..

  • No strong words plz.....

  • Viewing 12 posts - 1 through 11 (of 11 total)

    You must be logged in to reply to this topic. Login to reply