Why Shouldn't I use Linked server?

  • Hello Everyone,

    I have 4 SQL database servers (all using SQL Server) and need to communicate between them for lookup operations in my stored procedures. While we use Replication and SSIS at few places our legacy code primarily uses linked servers.

    Recently , I got a new set coding guidelines from my architects and they insisted we should not be using linked server. I am not aware of any issues with LinkedServer Performance apart from memory leaks in older versions ( below SQL 2008 SP1) .

    While I will be working with my architect team to understand their reasoning behind this , I wanted to get check with this group on any reasons why I should be moving away from Linked server.

    Note : I am aware of alternatives for Linked Server but I am trying to understand the negative impacts of using Linked server

  • While there are performance reasons for not using linked servers, I suspect the main reason your architect team does not want them used is if the link goes down then your application will stop.

    Your architect team wants to follow the principle of loose coupling. I would follow their advice.

  • I just went through something like this yesterday. It looked like we needed to make local copies of some of the databases (replication, some other method, whatever) because of supposed performance issues. In particular, there was one query that seemed to be the cause of folks hair bursting into flames. It took 4 minutes and 48 seconds to join 2 remote tables to 1 local table and 1 of the remote tables had to be joined twice.

    I rewrote the query and it now operates in less than a second.

    It'll be interesting to see what your "architects" come up with. I've found that good code is almost always the simplest and best way to go.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If your application needs to join Customers with PurchaseOrders, and your Customer table on ServerA, and your PurchaseOrder table on ServerB, then that architectural decision itself has forced you down the path of restorting to linked server connections.

    So, from an architectural perspective, the issue isn't whether or not you should use linked server connections; the issue is whether or not commonly related data should be split across multiple servers. There are some infrastructure architects who know a lot about infrastructure (virtualization, networking, storage, web services, etc.), but when it comes to logical and physical database design; they don't get it. They're not true database architects.

    What I typically do (as an alternative to 4-part remote joins) is leverage the EXEC .. AT.. syntax to perform a pass-through style query, pulling the remote data I need into a temporary tables, and then I join the local table with the temporary table. This is especially useful when the amount of remote data required is relatively small.

    For example:

    INSERT INTO #PO ( CustomerID, PurcahseAmount )

    EXEC ('select CustomerID, PurcahseAmount from PurchaseOrders') AT ServerB;

    SELECT ...

    FROM Customers AS C

    JOIN #PO ON #PO.CustomerID = C.CustomerID

    ...

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (12/20/2016)


    If your application needs to join Customers with PurchaseOrders, and your Customer table on ServerA, and your PurchaseOrder table on ServerB, then that architectural decision itself has forced you down the path of restorting to linked server connections.

    So, from an architectural perspective, the issue isn't whether or not you should use linked server connections; the issue is whether or not commonly related data should be split across multiple servers. There are some infrastructure architects who know a lot about infrastructure (virtualization, networking, storage, web services, etc.), but when it comes to logical and physical database design; they don't get it. They're not true database architects.

    What I typically do (as an alternative to 4-part remote joins) is leverage the EXEC .. AT.. syntax to perform a pass-through style query, pulling the remote data I need into a temporary tables, and then I join the local table with the temporary table. This is especially useful when the amount of remote data required is relatively small.

    For example:

    INSERT INTO #PO ( CustomerID, PurcahseAmount )

    EXEC ('select CustomerID, PurcahseAmount from PurchaseOrders') AT ServerB;

    SELECT ...

    FROM Customers AS C

    JOIN #PO ON #PO.CustomerID = C.CustomerID

    ...

    I don't use 4 part naming either. I create 2 part synonyms for the 4 part objects so that code changes aren't needed if anything changes with the linked server name or definition.

    And I agree with the other things you said. Pulling data into a local temp or scratch table on demand can be much more efficient even with some relatively large tables.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Fair point Jeff. There shouldn't be really any doubt that if we replicate tables locally and use local tables in our join then performance will be definitely be better than pulling the data over the network during the proc execution.

    Though its a big deal to set up replication and maintain it but yet the overall performance will be much better and as long view as our data volume increases , this will be something really required.

    Eric , Your solution to pull the data locally makes sense to me and I will keep that in mind if I am using small tables. But currently my dataset is big , so I guess pulling it locally might result in more issues.

    Thanks Jeff and everyone else who made it easy to understand this 🙂

  • In the nutshell, it is well said above that using Replication or other option is better to be used in place of linked server. Just adding one more thought that at some point, we cannot always prefer Replication above linked server. Could anyone please list couple of scenario where one will like to use linked server instead of replication?

    Regards
    VG

  • Distributing tables vertically (ie: ServerA contains a subset of data for current year and ServerB contains archive data same tables) can be an effective scale-out solution (depending on the access patterns of the queries), but horizontal partitioning (ie: Customer table on ServerA and PurchaseOrders on ServerB) is probably a dumb idea that will be very un-scalable.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • SQL Learner - VKG (12/21/2016)


    In the nutshell, it is well said above that using Replication or other option is better to be used in place of linked server. Just adding one more thought that at some point, we cannot always prefer Replication above linked server. Could anyone please list couple of scenario where one will like to use linked server instead of replication?

    I suppose two major factors in what solution to use are how frequently you need to query the remote data, and how up to date you need the remote data to be. In Shekhu's case, it looked like it would be queried frequently and needed as up to date as possible so replication makes a good fit.

    In some other cases, say the source data doesn't change very often, then replication may be overkill to setup and maintain, you could just synchronize the local copy of the remote data some other way using regular TSQL commands over a linked server. Another case for this would be where the remote data isn't needed as often, for example in a data warehouse I only need to synchronize data once a day from the OLTP system into my ODS, so I do that across a linked server once and then run all my delta scripts for dimensions and facts off of ODS.

  • Eric M Russell (12/20/2016)


    If your application needs to join Customers with PurchaseOrders, and your Customer table on ServerA, and your PurchaseOrder table on ServerB, then that architectural decision itself has forced you down the path of restorting to linked server connections.

    So, from an architectural perspective, the issue isn't whether or not you should use linked server connections; the issue is whether or not commonly related data should be split across multiple servers. There are some infrastructure architects who know a lot about infrastructure (virtualization, networking, storage, web services, etc.), but when it comes to logical and physical database design; they don't get it. They're not true database architects.

    What I typically do (as an alternative to 4-part remote joins) is leverage the EXEC .. AT.. syntax to perform a pass-through style query, pulling the remote data I need into a temporary tables, and then I join the local table with the temporary table. This is especially useful when the amount of remote data required is relatively small.

    For example:

    INSERT INTO #PO ( CustomerID, PurcahseAmount )

    EXEC ('select CustomerID, PurcahseAmount from PurchaseOrders') AT ServerB;

    SELECT ...

    FROM Customers AS C

    JOIN #PO ON #PO.CustomerID = C.CustomerID

    ...

    Personally, I hate replication. Sometimes it's just better to copy the data on demand. If it's for daily reporting, then you only need to do it once a day.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • SQL Learner - VKG (12/21/2016)


    In the nutshell, it is well said above that using Replication or other option is better to be used in place of linked server. Just adding one more thought that at some point, we cannot always prefer Replication above linked server. Could anyone please list couple of scenario where one will like to use linked server instead of replication?

    First, it's not always a "better" option especially when it comes to replication.

    As for the alternative, it's sometimes better to either copy the table on the fly or do it during a nightly job if the data doesn't need to be up-to-the-minute.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Chris Harshman (12/21/2016)


    SQL Learner - VKG (12/21/2016)


    In the nutshell, it is well said above that using Replication or other option is better to be used in place of linked server. Just adding one more thought that at some point, we cannot always prefer Replication above linked server. Could anyone please list couple of scenario where one will like to use linked server instead of replication?

    I suppose two major factors in what solution to use are how frequently you need to query the remote data, and how up to date you need the remote data to be. In Shekhu's case, it looked like it would be queried frequently and needed as up to date as possible so replication makes a good fit.

    In some other cases, say the source data doesn't change very often, then replication may be overkill to setup and maintain, you could just synchronize the local copy of the remote data some other way using regular TSQL commands over a linked server. Another case for this would be where the remote data isn't needed as often, for example in a data warehouse I only need to synchronize data once a day from the OLTP system into my ODS, so I do that across a linked server once and then run all my delta scripts for dimensions and facts off of ODS.

    Nice update. Does location of data center effects replication or linked server? One server is one state and other is in another state.

    Regards
    VG

  • SQL Learner - VKG (12/22/2016)


    Nice update. Does location of data center effects replication or linked server? One server is one state and other is in another state.

    With a linked server solution in this case you'd still definitely want a local copy of the data, synchronized periodically, so that each query wouldn't have the latency of going over the WAN to retrieve data. With periodic synchronization using TSQL commands vs. replication you'd still probably make a decision by how current you need the local copy of the data to be.

  • Makes Sense. Thanks.

    Regards
    VG

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply