Access Excel Hyperlinks from SQL

  • Hi,

    I have to import Excel documents into a SQL table. That part is simple enough and can easily be achieved using OPENROWSET().

    The problem I am having is that the client wants to see the hyperlinks they have applied to cells as an extra column in the table:

    Example:

    [sheet1]

    Cell A1: Contains the word "Google" and has a hyperlink that opens "http://www.google.com/"

    Cell A2: Contains the word "Bing" and has a hyperlink that opens "http://www.bing.com/"

    I need to insert this into the table "Search Engines"

    Column 1 (row 1)= 'Google'

    Column 2 (row 1)= 'http://www.google.com/'

    Column 1 (row 2)= 'Bing'

    Column 2 (row 2)= 'http://www.bing.com/'

    I can find the following possible solutions when I Google the problem:

    VB Macro to expose the hyperlink in another cell inside the workbook.

    Manually copy and pasting the hyperlink into new cells.

    C# code to do the import.

    The problems with these solutions are:

    C# is not an option.

    The other two would require me to manually go through roughly 12,000 Excel files, with an average of 15 worksheets each, to extract the hyperlinks.

    Is there any way to get hold of the hyperlink using SQL, or for that matter any file I can execute via sqlcmd against the Excel workbooks, to copy the hyperlinks to another cell without having to manually open and edit each workbook?

    Thanks in advance.

    Philip

  • C# is not an option :crying:

    Here is a mockup of some code for a Script Task that reads a Worksheet row-by-row and accesses the Hyperlinks collection for each row. I am no Excel expert and this only took me about 15 minutes to mock up. I am sure it could be easily extended to suit your needs:

    using System;
    using Microsoft.Office.Interop.Excel;

    //...

    public void Main()
    {
        Microsoft.Office.Interop.Excel.Application excelApp = new Microsoft.Office.Interop.Excel.Application();
        excelApp.Visible = true;

        Workbook workbook = excelApp.Workbooks.Open(@"C:\@\Book1.xlsx",
            Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing,
            Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing,
            Type.Missing, Type.Missing, Type.Missing, Type.Missing);

        Worksheet worksheet = (Worksheet)workbook.Worksheets["Sheet1"];

        // Note: worksheet.Rows enumerates every row on the sheet, used or not.
        foreach (Range range in worksheet.Rows)
        {
            // The key line: each Range exposes the hyperlinks attached to its cells.
            foreach (Hyperlink hyperlink in range.Hyperlinks)
            {
                string hl = hyperlink.Address; // the URL the cell links to
            }
        }

        // Close without saving and quit Excel so no EXCEL.EXE process is left behind.
        workbook.Close(false, Type.Missing, Type.Missing);
        excelApp.Quit();

        Dts.TaskResult = (int)ScriptResults.Success;
    }

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Hey opc.three,

    Thanks for the reply.

    This sparked another thought that I would like to run by you and anyone else who might have more insight into CLR procedures.

    Would it be possible to use a CLR procedure to copy the hyperlink text to another cell for each row and then save the Excel file?

    Regards,

    Philip

  • Philip-1144230 (1/10/2013)


    Hey opc.three,

    Thanks for the reply.

    This sparked another thought that I would like to run by you and anyone else who might have more insight into CLR procedures.

    Would it be possible to use a CLR procedure to copy the hyperlink text to another cell for each row and then save the Excel file?

    Regards,

    Philip

    I would not recommend putting anything to do with Office interop into SQLCLR.


  • Would you mind elaborating?

    I have never worked with CLR Procedures, I've only read about them and the possibilities that they provide.

    I'm quite new to SQL and all its abilities, and I would like to gain an understanding beyond a plain "how to" or "not quite", if that makes sense to you.

    Apologies if I am being a nuisance.

  • Not at all. My response was admittedly a bit Spartan.

    Office objects are designed to be used on a client system, not a server. They are notorious for memory problems and can make any system unstable. On top of that, using them would force you to mark your SQLCLR objects UNSAFE, which I never condone.

    SQLCLR is most conservatively used to extend the Transact-SQL language with objects that can all be marked SAFE and that do not need to reach anything outside the database engine itself (i.e. no need to access the file system, the internet, network resources, or DLLs and assemblies beyond what SQLCLR provides by default). SQLCLR is not an open offer to implement full-scale .NET functionality that accesses resources outside the database engine. If you are developing an object and find it needs to be marked EXTERNAL_ACCESS or UNSAFE, it is a sign you should go back to the drawing board and try to find a different way.

    I may hold an overly conservative view on the topic according to some, but I prefer to keep servers hosting applications (.NET, Java, etc.) and servers hosting SQL Server databases on distinct, separate physical machines that do not share any resources. Resist the urge to turn SQL Server into an application server; that is what Windows and stand-alone .NET applications (of which SSIS is a type) were designed for.


  • Thank you, I appreciate the effort.

    Taking into consideration what you've just said, it's forcing me to rethink what I am trying to achieve, and that's always a good thing. :unsure:

    It does, however, reinforce my original post.

    What I am creating now will be used in future to import Excel documents into their database and, at their insistence, has to be a SQL procedure, which limits my options considerably.

    Would I be able to access a cell's hyperlink through SSIS?

    Thanks again.

  • Disregard the previous question, answer is located here:

    http://social.msdn.microsoft.com/Forums/en/sqlintegrationservices/thread/f863db04-c84e-4283-8e31-565b46f0803d

    Would you consider SSIS to be a better practice?

  • Sure. I developed the C# code I posted inside an SSIS Script Task. You'll just need to add a reference to Microsoft.Office.Interop.Excel.dll in the Script Task VSTA Project (what opens when you click Edit Script) to get started.

    That said, I would urge you to challenge the piece of the requirement saying this functionality must be available by calling a stored procedure unless they want to move to SQL 2012 where SSIS packages can be securely executed using T-SQL. Kicking off an SSIS package in SQL 2008 R2 and below from within a T-SQL stored procedure implies you'll have to enable xp_cmdshell to be able to call dtexec.exe and that is a dealbreaker in my book. The lesser of the evils would be to create a SQLCLR proc marked for EXTERNAL_ACCESS that will call dtexec.exe.


  • Philip-1144230 (1/10/2013)


    Disregard the previous question, answer is located here:

    http://social.msdn.microsoft.com/Forums/en/sqlintegrationservices/thread/f863db04-c84e-4283-8e31-565b46f0803d

    Would you consider SSIS to be a better practice?

    I would not put much stock in that post. The suggestion that macros are somehow safer than accessing a Worksheet using the Excel object is unfair. They are both dangerous. As I said in my previous post I would not want to host this operation on the server hosting my databases. Steer the requirement provider towards putting this onto an application server and keep it away from and out of SQL Server.


  • Valuable input is always welcome.

    Seems I'll have to complete some SQL courses in the near future, along with a lot of practical examples, before I take on another SQL-based project.

    That said, I'll still have to solve this one.

    Thanks again for all the help, appreciate it.

  • Anytime, good luck. If you have a moment to circle back, I would be interested to hear about where you landed, and how you got there. Of course if you hit a snag post back, or make a new thread. Here to help.


  • Philip-1144230 (1/10/2013)


    the other two would require me to manually go through roughly 12000 Excel files with an average of 15 worksheets each to extract the hyperlinks

    Philip, it sounds like you have already resolved to make some changes regarding future requests, but that you still have to solve this problem. VBA is very powerful, and from it you can do any kind of file handling. It would be relatively easy to loop through your files, flip through the pages, and expose the links. Granted, it would probably take a little while for the script to run. This would be a good option, *if* the links are *always* in the *exact* same location. If you want to go this route, there are lots of resources out there to help get you started.
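    The loop described above (every file, every worksheet, every link) can be sketched roughly as follows. This is only an illustration, written in C# to match the code already in this thread; the same Excel object model is available from VBA. The class name, method name, and folder argument are hypothetical, and it assumes C# 4 or later, a reference to Microsoft.Office.Interop.Excel, and a machine with Excel installed that is not the database server. It uses each worksheet's own Hyperlinks collection, so the links do not have to sit in a fixed location.

```csharp
using System;
using System.IO;
using Excel = Microsoft.Office.Interop.Excel;

public static class LinkDump
{
    // Hypothetical helper: prints file, sheet, row, column and link target
    // for every cell hyperlink in every .xlsx workbook in the given folder.
    public static void DumpLinks(string folder)
    {
        var excelApp = new Excel.Application();
        excelApp.Visible = false;
        excelApp.DisplayAlerts = false; // suppress prompts during the batch run

        try
        {
            foreach (string path in Directory.GetFiles(folder, "*.xlsx"))
            {
                // Open read-only; nothing is written back to the workbook.
                Excel.Workbook workbook = excelApp.Workbooks.Open(path, ReadOnly: true);
                try
                {
                    foreach (Excel.Worksheet sheet in workbook.Worksheets)
                    {
                        foreach (Excel.Hyperlink link in sheet.Hyperlinks)
                        {
                            // Type 0 (msoHyperlinkRange) is a link on a cell;
                            // other values are links attached to shapes.
                            if (link.Type != 0)
                            {
                                continue;
                            }

                            // Address can be empty for links that point inside the workbook.
                            Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}",
                                path, sheet.Name, link.Range.Row, link.Range.Column, link.Address);
                        }
                    }
                }
                finally
                {
                    workbook.Close(false); // close without saving
                }
            }
        }
        finally
        {
            excelApp.Quit(); // do not leave an EXCEL.EXE process behind
        }
    }
}
```

    The tab-separated output could then be bulk-loaded into a staging table and joined to the imported rows by file, sheet, row and column.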

    Greg
    _________________________________________________________________________________________________
    The glass is at one half capacity: nothing more, nothing less.

  • Greg Snidow (1/10/2013)


    Philip-1144230 (1/10/2013)


    the other two would require me to manually go through roughly 12000 Excel files with an average of 15 worksheets each to extract the hyperlinks

    Philip, it sounds like you have already resolved to make some changes regarding future requests, but that you still have to solve this problem. VBA is very powerful, and from it you can do any kind of file handling. It would be relatively easy to loop through your files, flip through the pages, and expose the links. Granted, it would probably take a little while for the script to run. This would be a good option, *if* the links are *always* in the *exact* same location. If you want to go this route, there are lots of resources out there to help get you started.

    The C# code I provided does exactly that (flips through all cells in a Worksheet). And you are right, it is painfully slow. If you know the column and defend against the possibility of there not being a hyperlink on the cell it's not tough code to write and could perform acceptably if the scope were limited. The question of where to host the code though, that's the big decision on this one in my opinion.
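    A minimal sketch of that narrower approach, assuming the linked text sits in a single known column. The class name, the ReadLinks helper, and its parameters are hypothetical and not part of the code posted earlier. It walks only the used rows of the one column and skips any cell that has no hyperlink.

```csharp
using System;
using System.Collections.Generic;
using Excel = Microsoft.Office.Interop.Excel;

public static class HyperlinkReader
{
    // Hypothetical helper: returns (cell text, link address) pairs for one column.
    public static List<KeyValuePair<string, string>> ReadLinks(Excel.Worksheet worksheet, int column)
    {
        var results = new List<KeyValuePair<string, string>>();

        // Limit the scan to the rows Excel considers used, not the whole sheet.
        Excel.Range used = worksheet.UsedRange;
        int firstRow = used.Row;
        int lastRow = firstRow + used.Rows.Count - 1;

        for (int row = firstRow; row <= lastRow; row++)
        {
            Excel.Range cell = (Excel.Range)worksheet.Cells[row, column];

            // Guard: a cell without a hyperlink has an empty Hyperlinks collection.
            if (cell.Hyperlinks.Count == 0)
            {
                continue;
            }

            // Excel collections are 1-based.
            Excel.Hyperlink link = cell.Hyperlinks[1];
            string text = Convert.ToString((object)cell.Value2);

            // Address can be empty for links that point inside the workbook.
            results.Add(new KeyValuePair<string, string>(text, link.Address));
        }

        return results;
    }
}
```

    For the example in the first post, calling ReadLinks(worksheet, 1) against Sheet1 would be expected to yield the Google and Bing pairs, ready to insert as two columns.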


  • Hi Guys,

    I've gone back to the client with regard to the requirement that the import specifically be done by a SQL procedure.

    Waiting on their response now before I go any further with this.

    Thanks for all the responses up till now.
