EXCEL HELL!!!

  • Please see the attached spreadsheet and please understand that when it comes to SSIS, I don't even qualify as an informed neophyte.

    I don't need a demonstration or step by step instructions here (although you could post either or both to help other folks out). What I'd like to know is if SSIS could successfully import this spreadsheet so that the data ends up looking like the results in the graphic from an SSMS grid below?

    Again, I don't know much about SSIS but I don't believe this could be done without some sort of script or call to an external program but would love to be called wrong on this. Please let me know if it can be.

    As a bit of a sidebar, on a scale of 1 to 100 with 100 being the most difficult, how would you rate this type of spreadsheet import problem?

    Thanks for the help, folks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Importing a spreadsheet like this in SSIS would neither be considered trivial nor complex, more in the category of multi document files, on the scale of 1-100 it could be around 25.

    😎

    This can easily be achieved without any scripting or external program calls although such methods tend to be somewhat more sensitive to structural changes in the spreadsheet. Quick outlining of such a process could be along these lines:

    Sample data

    +--+------------+----+--------+-------+----+----+--------+----+----+-----+----+----+-----+---+----+

    |LN|F1 |F2 |F3 |F4 |F5 |F6 |F7 |F8 |F9 |F10 |F11 |F12 |F13 |F14|F15 |

    +--+------------+----+--------+-------+----+----+--------+----+----+-----+----+----+-----+---+----+

    | 1|Class Scores| | | | | | | | | | | | | | | <-- Header

    | 2| | | | | | | | | | | | | | | | <-- Ignore (blank)

    | 3| | | |2014 | | | | | | | | | | | | <-- Group 1

    | 4| | | |January| | |February| | |March| | |Total| | | <-- Group 2

    | 5|Name |ID |Status |Phy |Mat |Chem|Phy |Mat |Chem|Phy |Mat |Chem|Phy |Mat|Chem| <-- Group 3

    | 6|Name1 |ID1 |Status1 |1 |2 |3 |4 |5 |6 |7 |8 |9 |12 |15 |18 | <-- Row Data

    | 7|Name11 |ID11|Status11|11 |12 |13 |14 |15 |16 |17 |18 |19 |42 |45 |48 | <-- Row Data

    | 8|Name21 |ID21|Status21|21 |22 |23 |24 |25 |26 |27 |28 |29 |72 |75 |78 | <-- Row Data

    | 9|Name31 |ID31|Status31|31 |32 |33 |34 |35 |36 |37 |38 |39 |102 |105|108 | <-- Row Data

    |10| | |Total: |64 |68 |72 |76 |80 |84 |88 |92 |96 |228 |240|252 | <-- Summary Data

    +--+------------+----+--------+-------+----+----+--------+----+----+-----+----+----+-----+---+----+

    1. Import col#1, row#1 as Header label.

    2. Import col#4, row#3 as Header year.

    3. Import col#4,7,10,13 row#4 as Group Headers.

    4. Import col#1-15, row#5 as Column Headers.

    5. Import col#1-15, row#6 - #n-1 as Row Data.

    6. Import col#3-15, row#n as Summary Data.

    Transposing and merging these sets in the data flow is then a straight forward application of tasks such as derived column etc..

    Doing the same using script tasks and/or script transformations is even simpler and often more robust if properly constructed.

    Simplest of all is the ELT method of using the SSIS to import all columns as text into a staging table....;-)

  • Outstanding. Thank you VERY much for the time you've spent on this whether it may have been trivial or not, Eirikur. You obviously took some time to import that bad boy into a "flat" table.

    Glad to see that it's as complex yet as "simple" as I thought it would be. The fun part will be when they add a month between March and the Total. Would that be "self realizing" or what that require a change to the package?

    Doing the same using script tasks and/or script transformations is even simpler and often more robust if properly constructed.

    Simplest of all is the ELT method of using the SSIS to import all columns as text into a staging table....

    Absolutely agreed on the above and that's the point that I'm trying to drive home on someone.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/1/2015)


    Outstanding. Thank you VERY much for the time you've spent on this whether it may have been trivial or not, Eirikur. You obviously took some time to import that bad boy into a "flat" table.

    Glad to see that it's as complex yet as "simple" as I thought it would be. The fun part will be when they add a month between March and the Total. Would that be "self realizing" or what that require a change to the package?

    Doing the same using script tasks and/or script transformations is even simpler and often more robust if properly constructed.

    Simplest of all is the ELT method of using the SSIS to import all columns as text into a staging table....

    Absolutely agreed on the above and that's the point that I'm trying to drive home on someone.

    You really like to throw a spanner or two in the works ain't it :-p, your question is very valid and just, you've hit the bulls eye on the most common reason of why such a packages fail when ELT wouldn't

    😎

    If you need any more statistics or conformation on why the ELT method is superior to any other, just let me know.

    PS. on the time spent, you would do the same!

  • Heh... spanners are my specialty... especially when they span months. 😛

    Thanks again for the help, Eirikur. Your feedback on all of this confirms most of what I thought. One way or another, a script or two (maybe in the form of a stored procedure or two) will need to be used.

    As a bit of a side bar, this all came up because I was using ACE drivers to get the raw data and then a bit of dynamic SQL to resolve it all and someone told me I was nuts... that it should all be able to be done using SSIS only. Of course, I had to admit to them that I don't know much about SSIS but thought that he was wrong about it being script free and easy to do in SSIS only.

    Heh... and, yeah... I'm working on a presentation and an article for this. The two small procs that I came up with are "self-realizing" for the monthly spanner wrench that I threw into the mix. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This is a great day! I get to tell people I'm as good at something in SQL as Jeff Moden, even if it is SSIS. 😀

  • sqldriver (3/2/2015)


    This is a great day! I get to tell people I'm as good at something in SQL as Jeff Moden, even if it is SSIS. 😀

    😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • By the way - there are several examples already documented. This one's a fairly solid example (describing a similar process)

    http://www.mssqltips.com/sqlservertip/1761/how-to-use-the-unpivot-data-flow-transform-in-sql-server-integration-services-ssis/[/url]

    That said - you're still right - it wouild require some form of a script task to pull out the column headers and compose them into dates. In this case the script would likely be embedded within the data flow in SSIS.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (#4) (3/3/2015)


    By the way - there are several examples already documented. This one's a fairly solid example (describing a similar process)

    http://www.mssqltips.com/sqlservertip/1761/how-to-use-the-unpivot-data-flow-transform-in-sql-server-integration-services-ssis/[/url]

    That said - you're still right - it wouild require some form of a script task to pull out the column headers and compose them into dates. In this case the script would likely be embedded within the data flow in SSIS.

    Excellent article. Thanks for the link, Matt.

    If only spreadsheet users wrote such nice clean spreadsheets that never grew horizontally and never had date related headers that lived on more than one row! 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/1/2015)


    The fun part will be when they add a month between March and the Total. Would that be "self realizing" or what that require a change to the package?

    That would require a change to the package as you are adding extra columns.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Jeff Moden (3/3/2015)


    Matt Miller (#4) (3/3/2015)


    By the way - there are several examples already documented. This one's a fairly solid example (describing a similar process)

    http://www.mssqltips.com/sqlservertip/1761/how-to-use-the-unpivot-data-flow-transform-in-sql-server-integration-services-ssis/[/url]

    That said - you're still right - it wouild require some form of a script task to pull out the column headers and compose them into dates. In this case the script would likely be embedded within the data flow in SSIS.

    Excellent article. Thanks for the link, Matt.

    If only spreadsheet users wrote such nice clean spreadsheets that never grew horizontally and never had date related headers that lived on more than one row! 😛

    DREAM ON!!!! 😎



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Jeff Moden (3/3/2015)


    Matt Miller (#4) (3/3/2015)


    By the way - there are several examples already documented. This one's a fairly solid example (describing a similar process)

    http://www.mssqltips.com/sqlservertip/1761/how-to-use-the-unpivot-data-flow-transform-in-sql-server-integration-services-ssis/[/url]

    That said - you're still right - it wouild require some form of a script task to pull out the column headers and compose them into dates. In this case the script would likely be embedded within the data flow in SSIS.

    Excellent article. Thanks for the link, Matt.

    If only spreadsheet users wrote such nice clean spreadsheets that never grew horizontally and never had date related headers that lived on more than one row! 😛

    You DO believe in Santa, don't you :w00t:

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (#4) (3/3/2015)


    Jeff Moden (3/3/2015)


    Matt Miller (#4) (3/3/2015)


    By the way - there are several examples already documented. This one's a fairly solid example (describing a similar process)

    http://www.mssqltips.com/sqlservertip/1761/how-to-use-the-unpivot-data-flow-transform-in-sql-server-integration-services-ssis/[/url]

    That said - you're still right - it wouild require some form of a script task to pull out the column headers and compose them into dates. In this case the script would likely be embedded within the data flow in SSIS.

    Excellent article. Thanks for the link, Matt.

    If only spreadsheet users wrote such nice clean spreadsheets that never grew horizontally and never had date related headers that lived on more than one row! 😛

    You DO believe in Santa, don't you :w00t:

    Heh... one look at me lately and you might think I'm Santa. And I've got the natural padding to go with it. Starting to get my BWAAA-HAAAA-HAAA mixed up with my HO-HO-HO! 😛

    Thanks again for the help folks. You've been great.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff,

    I looked at your spreadsheet. Are you talking about a small application there

    a teacher wants to register scores for the students?

    During a period more students, subjects and months will be added to the spredsheet.

    I use SSIS but I would not use SSIS for this kind of application.

    Excel is an excellent front end to a database (for registration and to present data).

    I use VBA and ODBC (OLE DB). A table (there you can add rows and columns)

    with values in Excel can be inserted in a normalized table as wanted with some coding.

    Maybe a sheet there you register new students and subjects is recommended.

    To analyse the data I use PowerPivot.

    Gosta M

  • Thanks for the notes, Gosta.

    This isn't actually for any application. It was actually an example spreadsheet that I saw on one of the threads that I've modified a bit. If I were building a student scoring system, there'd be a bit more to it, as you say.

    The reason for my questions about SSIS is because I know how to do all of this from SQL Server using T-SQL and the ACE drivers but I don't know much about SSIS except the current 60,000 foot view of what some of it's capabilities and limitations are. These types of spreadsheet-import problems have been presented more and more as questions on these and other forums and I wanted to make sure that it wasn't so easy to do in SSIS that one would be an idiot for not doing it in SSIS before I spent time writing both an article and a presentation on how to do all of this in T-SQL. Part of the goal is to demonstrate that you don't need to have a handle on any thing else other than T-SQL and a very simple call using the ACE drivers to do this.

    As I suspected, it's not fall-off-the-slippery-log simple in SSIS. It's also not a beginner task in T-SQL but it's simple enough and does allow for self-healing/expanding imports to staging tables where you can do whatever else you want/need to do from there.

    Basically, this all falls into one of those items that I have in my signature line below about "not" doing certain stuff in T-SQL. "Not" is in quotes for a reason. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 63 total)

You must be logged in to reply to this topic. Login to reply