Using OpenRowset to Insert Data

  • Hi Guys,

    I've got some data I need to copy to a table on a different server, a convenient way for me to do it, is with the following:

    INSERT

    OPENROWSET

    (

    'SQLNCLI',

    'Server=blahblah\bluergh;Database=adventureworks;Uid=rickGrimes;Pwd=!h3yC0ral!!;',

    'select * from adventureworks.dbo.UserLogArchive'

    )

    SELECT *, 'UK' from myOtherDB..USER_LOG (NOLOCK)

    WHERE month([date]) =12 AND year([date]) = 2013

    I am not asking about how to identify the data as I am just using a dummy query in the final select above (the select not in the openrowset)

    But what I am wondering if that SELECT within the openrowset is REALLY going to first return select * from the link first. Would I better doing select top 1 *

    What actually happens behind the scenes of inserting into an openrowset

    I only have to do this for a few month, and then it never needs to run again, so not seeing a need for an SSIS (I don't want to BCP because of blob data), but was wondering if using the openrowset is doing something rather silly in the background.

    Cheers guys

    Alex

  • Hopefully, that's not a real UID and PW that you used in your post.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Nope, purely a placeholder ... Aimed at any Walking Dead / Honest Trailer fans that might be reading.

  • To answer my own question, for future readers:

    Insert via OpenRowset() for 15 million rows = 3.5 hours

    Insert via SSIS package for identical query and results = 15 minutes.

  • Good call. I have always found linked server/ openrowset slower and resort to it only in case of emergencies. Thanks for posting the statistics.

    Just curious, why are you not executing the insert statement in blablah server with openrowset from otherdb with a filter?

    ----------------------------------------------------------------------------------------------------------------------------------------------------
    Roshan Joe

    Jeff Moden -Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Hiya!

    Well, that was why I posted that question, to determine whether the OpenRowset() with that usage would evaluate the actual query inside it. I assumed not as if that were the case then as the table got bigger (From Zero rows to 100million rows) the insert should take longer for each batch. Every time 15 million rows were appended the SELECT inside the OpenQuery has more and more work to do. But I was not seeing that, it was a static 3hours regardless of how much data in in the table. And that 3 hours seems drastically long for what wasn't a particularly ugly table. Just ints, decimals and a small varchar. But it was going from a non-compressed table into a compressed table.

    That being said, its still massively faster to not use OpenRowset despite being more convenient in terms of scripting, and keeping my hands on the keyboard 🙂

  • This link has helped me understand strategies for linked servers (and it might be that openquery to remote servers might be impacted also).

    http://sqlblog.com/blogs/linchi_shea/archive/2010/12/01/linked-server-and-performance-impact-direction-matters.aspx

    Bottom line, faster to select from remote servers than insert to remote servers by quite a bit.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply