Store Excel File in SQL

  • Hi, I have a high level what-if question to ask regarding SQL. I don't use SQL or Access for that matter but I have a keen interest in what may be a SQL/XML project.

    Is it possible to breakdown an Microsoft Excel file by content (i.e. values, formulas, worksheets, cell addresses, formats, and all other objects) and store this in SQL? Note, I am not looking to import a standard table from Excel. I am looking to breakdown the anatomy of file and store it.

    I am looking for a way to:

    1) archive a lot of spreadsheets into a database as quickly as possible

    2) be able to query them quickly

    3) potentially recreate the spreadsheet seen via a web-browser

    If so, do I need to consider using XML to do this? I know that Excel 2007 has its new XML format.

    Thanks

    EM

  • If you can get the uncompressed XML format of the Excel file, then you can import and shred that.

    Alternatively, you can use Excel automation outside of SQL Server to open the Excel files, break them down and then store them into SQL Server.

    Either of these approaches is probably at least a weeks worth of work. The only quick & dirty method wpuld be to store the spreadsheets into SQL server without breaking them down.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Second thoughts: I am not sure if the XL automation technique can extract the macros or not. I've never tried that.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Thanks for the response. When you say it would take weeks to do, why is this? Where is the time being consumed? Is this assuming many files or is this assuming the coding to build the solution?

    Your alternative option was to use Excel automation. I am assuming I can then build a VB project which I could effectively use to breakdown the file into its parts. That is, I can strip out:

    - the cell addresses by type

    - sheet names

    - macros with code

    - links

    -etc

    Thanks

    EM

  • ExcelMonkey (3/31/2009)


    Thanks for the response. When you say it would take weeks to do, why is this? Where is the time being consumed? Is this assuming many files or is this assuming the coding to build the solution?

    In both cases the time is for the develpoment effort to delineate each object type & property, design & create the tables to hold them, write the code to extract & store them, test & debug all of this. It does not include the operational time to execute all of this, nor the time to develop the re-assembly code.

    Your alternative option was to use Excel automation. I am assuming I can then build a VB project which I could effectively use to breakdown the file into its parts. That is, I can strip out:

    - the cell addresses by type

    - sheet names

    - macros with code

    - links

    Yes. Though I prefer to do this kind of project in Excel macros themselves. And I am not sure extracting the macros through automation, I have never tried that.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • When you say "The only quick & dirty method would be to store the spreadsheets into SQL server without breaking them down." is this suggesting that you simply dump the contents of the spreadsheets to SQL without an heirarchical structure? I am trying to understand how you are storing it without breaking it down?

    Thanks again

    EM

  • It's an XML document. SQL Server 2005 has an XML datatype. So you just create a table with a column of the XML datatyoe and then load them into that table. You can even through some XML indexes onto it if you want to query it.

    This approach also skips the re-assembly step, you just extract it to an Excel XML file.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • You know, the things is, I do not actually know how you get Execl into that XML format. The ".xslx" are not XML, the are compressed XML, and I do not know how to uncompress them to real XML. Does anybody else?

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Ah OK. apperently they are really ZIP folders, sou you can just unzip them to a folder tree of XML files.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • So the advent of the this new XML format for Excel 2007 has created a standardized format (Office Open XML) for Excel documents. Now I take this to mean that this and the other standard format (Open Document Format ODF) make developers' lives easier as there is a standard XML format for spreadsheets.

    So I assume this means that developers no longer need to write code to convert spreadsheet files to XML. However, as you stated they need to "delineate each object type & property, design & create the tables to hold them, write the code to extract & store them and reassemble them".

    Are their third party components for this? I did come across this: http://www.gemboxsoftware.com/GBSpreadsheet.htm.

    Thanks again

    EM

  • Sorry, I do not know the answer to that last question.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply