Import Excel file with header in row 5 Using SSIS

  • Hi,

    I would like to import an excel file that contains the headers in row 5.i have been searching the internet but most people talk about using OpenRowSet and i have been trying it but keep on getting errors.

    could anyone suggest a tutorial or help me with step by step process to do this. if there is better way to do this please advise me as i need to automate this other file that we recieve weekly.

    sample below

    --------------------------------------------------------------------------------------------------------------------

    Report Name:Active Workstations - Projects(Estie)

    Report Folder:/ConfigMgr_CE0/_SARS - Custom Reports

    Workstation NameAgent Site CodeSite NameReporting Site

    DHW09808BR4 PafuriBR0

    DHW09809BRB Kosi BayBR0

    DHW09820BR4 PafuriBR0

    DHW09830BRB Kosi BayBR0

    DHW09832BR4 PafuriBR0

    DHW09848BRB Kosi BayBR0

    DHW09854BRB Kosi BayBR0

    --------------------------------------------------------------------------------------------------------------------

  • I don't know of a way to directly tell SSIS to skip the first five rows, but offer this possible method to load your table. I did have to assume some things about your data. The posted sample data headers in row 5 don't appear to align with the data, so I made an Excel spreadsheet with four columns. I put the rows starting with "Report...." entirely in column A, but that wouldn't matter in this solution.

    Report Name: Active Workstations - Projects(Estie)

    Report Folder: /ConfigMgr_CE0/_SARS - Custom Reports

    Workstation Name Agent Site Rpt Site

    DHW09808 BR4 Pafuri BR0

    DHW09809 BRB Kosi Bay BR0

    DHW09820 BR4 Pafuri BR0

    DHW09830 BRB Kosi Bay BR0

    Now for how to load it:

    I created the target table with the column names shown, but (here's the slightly tricky part) they are not used for reading the Excel file. Instead of copying the entire table, use a SQL query to select just the rows you want. You'll have unchecked the "First row has column names" box in the Excel file's source connection manager. In that condition, the Excel columns are named "F1", "F2", "F3" and so on. A simple select statement (that assumes your workstation names all start with "DH") would be:

    Select F1, F2, F3, F4

    from [Sheet1$]

    where F1 like 'DH%'

    Alternatively, exclude the first five rows with something like this:

    Select F1, F2, F3, F4

    from [Sheet1$]

    where F1 not like ' %'

    and F1 not like 'Report%'

    and F1 not like 'Workstation%

    Then, in your dataflow destination, you map the columns F1 to Workstation Name, F2 to Agent and so on.

  • Well, I'm assuming you want to read data from 6th row onwards and then export this to some destination.

    Why don't you first dump the excel into a staging table , which has an identity column (with seed = 1) ,say id, and then from this table you can retrieve only rows that have id>=6.

    Depending on whether you have the liberty of making a staging or temp table, this could be a worthy solution to your problem...

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply