Optimizing Linked Server Queries

Question

Optimizing Linked Server Queries

Steve Jones - SSC Editor

SSC Guru

Points: 734434
More actions
January 13, 2016 at 2:24 pm

#324024

Perhaps a few things, but if someone could show some real LinkedServer queries and then a few ways to make them run faster.
Potentially you have queries that move data to the linked server, maybe join with data from a linked server.

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply

Hugo Kornelis SSC Guru Points: 64780 More actions · Answer 1

I have had some very unpleasant experience with linked server queries recently. Unfortunately, I have not discovered any silver bullet.

When troubleshooting linked server performance, always study the execution plans in detail. Look at the operators that do the remote work. Open their properties and study the query that is pushed to the other server. (Yes, it will be an awful task to understand because all column names are aliased with meaningless numbers and the generated query looks terrible, but you'll still have to).

In a normal query, the local execution plan is the only possible source of issues. In a remote query, there are three potential sources of misery:

1. The local execution plan - this is the part that most people that already have some tuning experience will get fairly easily by studying the execution plan

2. The remote execution plan(s) - you do not see this, so you will have to look at the properties of remote operators, find the query, copy/paste it into a query window that connects to the remote server and then look at its execution plan

3. Network overload. I have seen execution plans where all rows of a one-million-row table are pushed over the network, and then a filter throws away all except the one it needs. More on that below.

You will have to look at each of these in isolation and combined. E.g., a remote query that takes 0.2 seconds to execute on the remote server is not a problem if it is executed once, but if it is sitting in the inner input of a nested loops join and the actual execution count shows that it was invoked a miillion times, then you have just found an explanation for one million * 0.2 seconds = 55.5 hours execution time.

re: point 3 - the case where I found a full table being pushed over the network to retrieve a single row was caused by collation differences. Because the remote server had a different collation than the local server, the optimizer decided that it could not trust the remote server to filter on a string column - so instead of sending the query with the WHERE condition to the other server (where an index seek would have been used), it decided to pull all rows over the network and filter locally (without the benefit of an index, because only the data is transported over the network).

I was able to fix this for a SELECT query by fiddling with the linked server properties. For an UPDATE statement, though, I did not find the properties required to change this; in the end I had to ask the developers to revise that particular bit of logic. (Which they did by changing SET AuditColumn = USER_NAME() to SET AuditColumn = 'xxx' - because they knew that this logic would only ever be executed in the security context of xxx.)

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/