Importing improperly formatted excel file

  • business people sent me an excel file.

    looks like the picture. this is not very well data friendly. there is no column name.

    They want to import data from this excel sheet to sql server using ssis.

    in SQL server, a table will be created with two columns, Code and Name.

    it will be like this in sql server.

    Code Name

    10 Energy

    1010 Energy

    101010 Energy Equipment & Services

    101020 Oil,Gas & Consumable Fuels

    10101010 Oils & gas Drilling

    10101020 Oil & Gas Equipments & services

    I do not keep the description.

    Basically, we want to keep the numeric code and name only. Numeric codes are in column a,c,e,g. names are in column b,d,f,h

    is this possible?

  • Is this a one time process or something that will reoccur? If it is a one time process the business will probably be better off entering the data by hand than trying to write code for it. If it is recurring process I'd probably use a script component in the data flow to grab the data I needed.

  • it is reoccurring. thank you for the reply. there are about 350 rows in the excel file. I don't know if I can pick data by column with script component.

  • One-timer: Adda filter to the EXCEL section, filter out empty rows (for every 2nd column) and copy the relevant cells (CTRL +C, CTRL +V).

    Permanent Process: Copy the table to a SQL Server staging table. Either UNION ALL or CROSS APPLY to get the two columns you need and insert it into the target table.

    If you want to keep the description (e.g. in cell H7) you could use a self join with an offset of 1 (assuming the staging table has an identity column and you don't use parallel processing for the insert).



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • I know this is a SQL forum, but for crazy Excel imports, I've found creating macros in VBA to be very helpful. One can move the data to a new sheet and in the format that you wish. You could start by recording a macro as you perform this manually, and looking at the VBA code generated and try hacking it from there (on a test workbook).

    ----------------------------------------------------

  • actually I finished this task with SSIS. it was really nasty and painful.

  • I finished this task with ssis.

    I used sql command to import from excel.

    select f1,f2 from [sheet1$] where f1 like '[0-9]%'

    and do it for other columns.

    put them in 2 columns with Union All

    get the date with VB and array.

  • SSIS is a great tool.Once you connect to your data source within a dataflow task you can create a derived column. Here there are built in string and other operators for which you can pull out a date and re arrange it into a format that a datetime field (as an ole db destination) will recognize. You won't need to pull out a script task every time.

    ----------------------------------------------------

  • i could not find other operators for which i can pull out a date from a cell and text. if you know, let me know.

  • @watchgeek,

    I see that you've already pounded out a solution. It does't appear that this spreadsheet has any proprietary or private information in it. Would it be possible for you to attach it to a post so I can download it and play with it? I'd like to give a non-SSIS solution a go on it just for fun (heh... yeah, I know... I've gotta get a life. :-P)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • sure. the excel sheet is available online.

    http://www.msci.com/products/indexes/sector/gics/gics_structure.html

    it is the most recent excel file. 2014 one.

  • .

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply