Linked Server Performance and TempDB

  • So I'm doing a fairly straight-forward insert query involving tables/joins between a local db and a db on a linked server (I have no control over the linked server), inserting data into a local table.. Suddenly, the query fails with...

    Msg 1105, Level 17, State 2, Line 1

    Could not allocate space for object 'dbo.SORT temporary run storage: 161991253140789' in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

    This is a very seldomly maintained shared dev environment so the above problem (while still a mystery to me) isn't my main concern. What is concerning me is how linked servers affect performance. Two questions come to mind:

    1. When querying large amounts of data across the linked server, is it better to "stage" a subset of the data locally and then query it? I tend to shy away from making copies of data, even if only temporarily.

    2. When doing joins using keys that exist both locally and remotely, is it better to use the fields on the local table or the remote table? Example:

    SELECT

    t.AccountNumber

    ,t.CustomerID AS LocalCustomerID

    ,c.CustomerID AS RemoteCustomerID

    ,t.AddressID AS LocalAddressID

    ,a.AddressID AS RemoteAddressID

    ,a.Address

    FROM

    dbo.MyAccounts AS t

    INNER JOIN

    [MyLinkedServer].CustomerInfo.dbo.Customer AS c

    ON t.CustomerID=c.CustomerID

    INNER JOIN

    [MyLinkedServer].CustomerInfo.dbo.CustomerAddress AS a

    ON a.AddressID=c.AddressID --OPTION 1: Using the Remote fields in the join

    --ON a.AddressID=t.AddressID --OPTION 2: Using the local fields in the join

    In the above example, which of the last two lines ("ON a.AddressID...") performs better...or does it matter?

    Any tips would be appreciated.

  • Try this way:

    SELECT

    t.AccountNumber

    ,t.CustomerID AS LocalCustomerID

    ,c.CustomerID AS RemoteCustomerID

    ,t.AddressID AS LocalAddressID

    ,c.AddressID AS RemoteAddressID

    ,c.Address

    FROM

    dbo.MyAccounts AS t

    INNER JOIN OPENQUERY(MyLinkedServer, '

    SELECT rc.CustomerId, rc.AddressId, rc.Address

    FROM CustomerInfo.dbo.Customer AS rc

    ON ra.AddressID=rc.AddressID

    ') AS c

    ON t.CustomerID=c.CustomerID

    When working with linked servers, always try to do as much as you can remotely, possibly with openquery. Select only the fields you need and filter the remote results as much as possible.

    When you use four-part names ([linked server].[remotedb].[schema].[object]) the engine will probably decide to take a copy of the whole remote object in tempdb. If you use openquery, the command passed as second argument is executed "as is".

    Hope this helps

    Gianluca

    -- Gianluca Sartori

  • It makes a copy of the remote object in TempDB? Including the data? Wow, that would be bad. My remote database is HUGE!!

    PassThrough query suggestion makes sense. But given that my remote tables are HUGE, that there is a 1-to-n relationship between the tables, and that there isn't any real limiting criteria in the remote query, won't that make the join results HUGE x n? Seems worth considering, especially if you are right and that a copy of the remote object is created locally in TempDB.

    As for the joins: What's interesting to note is that the query fails with that error when I use the fields in the remote database (Option 1 in my original example) for my join, but not when I use the key fields in the local database (Option2 in my original example).

  • Uncle Moki (7/17/2009)


    won't that make the join results HUGE x n?

    No, if you just specify the columns you need. In that case results are smaller.

    Uncle Moki (7/17/2009)


    As for the joins: What's interesting to note is that the query fails with that error when I use the fields in the remote database (Option 1 in my original example) for my join, but not when I use the key fields in the local database (Option2 in my original example).

    Try looking at the query plan and the properties for the remote query, I think you will find the answer there.

    Good Luck!

    Gianluca

    -- Gianluca Sartori

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply