SQLServerCentral Article

Linked Servers: a Peek Behind the Curtain

,

We recently received a support call from a Gallium Data user. He was executing queries on his instance of SQL Server, but these queries were getting mysteriously changed, which caused his logic in Gallium Data to not run as expected. What was going on?

Enter Linked Servers

This issue was caused by the use of of linked servers. This is a powerful feature of SQL Server that allows you to stitch multiple databases together (not just SQL Servers) and run queries that span multiple database servers, thus creating a distributed database.

To create a linked server, you run the sp_addlinkedserver stored procedure, for instance:

EXEC master.dbo.sp_addlinkedserver
@server = N'MyLinkedServer',
@srvproduct = N'',
@provider = N'SQLNCLI',
@datasrc = N'192.168.1.69,1432', -- IP address and port number of the destination server
@catalog = N'AcctDb'; -- database name

and you now have a connection to the specified SQL Server instance. There are different types of linked servers, so there is quite a bit more to it than this, but you get the gist.

This new linked server will then appear in e.g. SSMS:

We'll skip over the question of authentication: there are many options, you can map one account to another, etc.

Once that's all in place, you can run queries that include tables both on the server itself and on linked servers, e.g.:

SELECT cust.name, acct.balance
FROM custdb.customers cust, MyLinkedServer.AcctDb.dbo.Accounts acct
WHERE acct.custid = cust.id

And it all behaves exactly as you'd expect.

The interesting thing is what happens between these two databases when you run this query.

What's going on down there?

Looking at the network traffic between the two database servers, we see that the main SQL Server instance does a lot more than just forward that query to the linked server. The full exchange is not necessarily run for every request, but if this is the first request to that linked server, the flow is as follows.

First, the main SQL Server instance opens a connection to the linked server and authenticates. Depending on how authentication has been set up, there can be several exchanges of various keys and tokens, but ultimately, a connection is opened to the linked server.

After that, the main server will ask a few questions of the linked server:

  • what is my session ID?
  • what kind of collation are you using?
  • what is the schema information for the tables involved in the query? This includes column definitions, indices and constraints

The main server then calls sp_reset_connection to make sure the connection is clean, and SET XACT_ABORT OFF to avoid rolling back the transaction if an error is detected

At that point, the main server sends a transaction manager request TM_BEGIN_XACT to open a distributed transaction. Now we're cooking with gas! But we're not done yet.

The next call is to sp_table_statistics2_rowset, an undocumented stored procedure that (obviously) returns some statistics about the tables participating in the query. This is followed by a call to SHOW STATISTICS, again to get some more information about the tables we're about to access.

Clearly, at this point, the main server is preparing a query plan, so it needs to know the lay of the land in the linked server. The next call is a transaction manager request to roll back the transaction: TM_ROLLBACK_XACT. Then there is another call to sp_reset_connection and SET XACT_ABORT OFF, and we start another distributed transaction with TM_BEGIN_XACT.

We now have a call to sp_schemalock, which is an internal stored procedure that makes sure that the schema does not change while the query is prepared and executed.

Are you still with me?

At last!

And now, finally, the main server executes our query. Except, of course, it can't be the query as submitted by the user, because that query spans multiple servers. The main server figures out which part(s) of the query need to be executed on the linked server, and formulates it in a way that can be executed by the linked server.

If you'll recall, the request we executed on the main server was:

SELECT cust.name, acct.balance
FROM custdb.customers cust, MyLinkedServer.AcctDb.dbo.Accounts acct
WHERE acct.custid = cust.id

This is what gets executed on the linked server:

SELECT "Tbl1002"."custid" "Col1001","Tbl1002"."balance" "Col1005"
FROM "AcctDb"."dbo"."Accounts" "Tbl1002"
WHERE "Tbl1002"."custid" = @P1

Obviously, your mileage will vary a great deal here. The main server is going to formulate a query that it believes is most appropriate based on the nature of the query, the statistics for the various tables involved in the query, and probably other factors. The query analyzer has a will of its own, and it usually does an excellent job.

Depending on the query, the main server may need to execute this statement many times to retrieve all the relevant data. It will join all that data as required to satisfy the original query, and a result set will be assembled and returned to the client. This is all completely transparent to the client.

In fact, this reformulation of the query happens even if you run a simple query that only involves the linked server. For instance:

SELECT id as acc_id, custid, balance
FROM [MyLinkedServer].[AcctDb].[dbo].[Accounts]

gets sent to the linked server as:

SELECT "Tbl1002"."id" "Col1004","Tbl1002"."custid" "Col1005","Tbl1002"."balance" "Col1006"
FROM "AcctDb"."dbo"."Accounts" "Tbl1002"

And that explains why our Gallium Data user was confused -- his queries, when received by the linked server's proxy, did not look like what he was executing on the main server, but that's just the way SQL Server works.

Cleanup

After the query has completed, the main server cleans everything up:

  • any prepared statements that may have been created are closed
  • the schema lock is released with a call to sp_releaseschemalock
  • the distributed transaction is closed with a call to TM_ROLLBACK_XACT (in the case of a select - obviously a statement that modifies data would usually invoke TM_COMMIT_TX)
  • the connection is reset with a call to sp_reset_connection

And the request is over. If you're feeling a little dizzy, I don't blame you.

The connection to the linked server is usually kept open for a while, because it's fairly expensive to establish, but it will get closed after a period of inactivity.

That's a lot of stuff!

Distributed transactions are notoriously hard to do correctly. I hope this peek behind the curtain has given you a better appreciation for how much work SQL Server does for you. This is not easy, but the folks at Microsoft have done an excellent job of making it transparent to their users. If the linked servers are full-fledged enterprise databases (Oracle, DB2, PostgreSQL, etc...), you can run distributed transactions with little difficulty.

We've only looked at two SQL Server instances working together, but linked servers can also be for different types of databases: there are providers for CSV files and spreadsheet files, for some NoSQL databases, Salesforce, and so on. With these providers, distributed transactions are usually not an option, since these data sources tend to be non-transactional, but they still allow you to run distributed queries easily.

It's not a bad idea to have at least a sense of how this is all done, because in your linked servers, you may see some activity that you did not expect. It's not unheard of, for instance, to get a schema lock that does not get properly released, leading to potential trouble. You may also see some queries that no one has specifically asked for -- they may be caused by another server sending queries to your server as a linked server.

Rate

5 (6)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (6)

You rated this post out of 5. Change rating