Linked Server Optimization
Posted Friday, March 7, 2008 7:51 AM
SSC Rookie
We had the same problem, and I used a table-valued function to limit the data being transferred. That way, the join was on the function, with a smaller result set being transferred over the network. The speed increase was huge in my case.

Ex. (from my bad memory)
Select ...
From vwLocalTable l
Cross Apply dbo.fnGetRemoteData(l.filterVal) r -- APPLY (SQL 2005+) lets the function reference l's columns; INNER JOIN cannot
Where l.key = r.key
and l.filtercol = 'something'

We put this whole thing in a view to help abstract the whole problem of linked server references in the SQL.

Hope this helps.
Post #465862
Posted Friday, March 7, 2008 8:22 AM
Forum Newbie
The situation described is our production architecture. We deal with this issue on an ongoing basis and have had various successes and failures.

The issue, as we have defined and observed it, is that when you join in a table on a remote server the remote server returns the entire dataset to the requesting server and then that dataset is optimized into the local query. So if I join in a table with 50,000 rows, even though in the ON clause I have filtered to only needing ONE row, the remote server will return a dataset with all 50,000 rows and then the local server will use the dataset in an optimized local query.

So the solutions offered do address the underlying question: How do I get the remote server to only return the rows that I actually want, instead of a complete dataset of the entire table?

Temp tables, OPENQUERY, etc. are fine IF you have the filter criteria when you create the temp table or OPENQUERY call. But if the filter criteria for the remote table are derived from the values of several other columns from several other tables in the query itself, then temp tables and OPENQUERY crumble away as the filter criteria drawn from columns in other tables increase. My understanding is that the remote server is returning a temp dataset anyway, so this can only be more effective if you have and can use filter criteria.
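To illustrate the limitation: OPENQUERY only helps when the filter can be baked into the query text ahead of time, because its second argument must be a string literal. The server and object names below are just placeholders, not anything from the posts above.

```sql
-- Works: the filter value is known before the query runs, so the
-- remote server executes the inner query and returns only matching rows.
SELECT r.keycol, r.colb
FROM OPENQUERY(RemoteSrv,
    'SELECT keycol, colb FROM SomeDb.dbo.TBLB WHERE colx = 42') AS r;

-- Does not work: OPENQUERY cannot accept a variable or reference
-- columns from other tables in the outer query, which is exactly the
-- case described above where the filter comes from several local tables.
```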

So what's a developer to do!??!!??

Here is what we do that has proven results. We use a join clause with a sub-select statement. In the example below TBLA is local and TBLB is remote.

Example:

Select TBLA.cola, TBLA.colb, TBLA.colc, TBLB.cola, TBLB.colb
From TBLA
Inner Join (Select cola, colb, colx, coly, colz From Remote.TBLB_Remote) TBLB
On TBLB.colx = TBLA.cola and TBLB.coly = TBLA.colb and TBLB.colz = TBLA.colc

When you use a join with a sub-select statement, the remote query is filtered by the criteria in the ON clause of your join. Instead of the remote sending back the entire dataset of the table, it only sends back the rows you actually need; of course, it makes a trip to the remote server for every row in TBLA.

If you can put a where clause on the sub-select to further limit the dataset, even better!
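For instance, a remote-side WHERE in the sub-select might look like the following, continuing the example above (the colstatus column is hypothetical, just to show where the extra filter goes):

```sql
Select TBLA.cola, TBLA.colb, TBLB.colb
From TBLA
Inner Join (Select cola, colb, colx
            From Remote.TBLB_Remote
            Where colstatus = 'ACTIVE') TBLB -- extra filter applied on the remote side
On TBLB.colx = TBLA.cola
```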

If your remote server is colocated in the data center with the local server (as in our case) and/or you have sufficient WAN bandwidth, this method is very efficient.

If the local server is located in Atlanta and the remote is located in Chicago and you have a DSL between locations, this would not be good. But nothing would be good in that case! :D

In essence you must choose between the load of a large amount of data being transmitted across the WAN against multiple trips with a small amount of data. Usually the latter wins.

Also, if you NEED the majority of the rows in TBLB, go ahead and do a regular join and let the local server optimize the remote dataset into the local query. In other words, if I am selecting 50,000 rows from TBLA and I have one or more matching rows in the remote TBLB for every row in TBLA, go ahead and get all the rows from TBLB at one time, because you're going to need them. In our experience, this is not usually the case.

Just one last point. Our development environment is not distributed: we take all the databases from all of our SQL servers and put them on a single development SQL server. When nothing is actually remote, the method I described above makes the query run slower. This makes the efficiency of new development queries that are intended to work cross-server difficult to measure.

Just another option to put in the toolbelt!

Well, nuff said.


Donald
Post #465904
Posted Friday, March 7, 2008 8:57 AM
SSC Veteran
I think the linked servers are not the only culprit here. I actually cringed when I saw these queries.

The first query is JOINing and WHEREing using functions. And [I believe] those functions will cause scans, right?

And the second query uses a SELECT INTO. We have seen TempDB access slowed down and even locked up by using SELECT INTOs.
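If SELECT INTO contention is the worry, one alternative sketch is to pre-create the temp table and use INSERT ... SELECT, which on older versions avoided holding tempdb system-table locks while the remote fetch runs. All object names here are made up for illustration.

```sql
-- Instead of: SELECT * INTO #remote_rows FROM RemoteSrv.SomeDb.dbo.TBLB
-- pre-create the table so tempdb metadata work finishes up front:
CREATE TABLE #remote_rows (keycol INT, colb VARCHAR(50));

INSERT INTO #remote_rows (keycol, colb)
SELECT keycol, colb
FROM RemoteSrv.SomeDb.dbo.TBLB
WHERE keycol BETWEEN 1 AND 100;  -- push whatever filter you can to the remote side
```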
Post #465943
Posted Friday, March 7, 2008 9:33 AM
SSC Veteran
OPENQUERY is a good way to go as well, but it requires that this method be opened up to more than just the sysadmin role, something we may not want to do for all users. You also have to make sure the OLE DB provider being used has OPENQUERY enabled for non-admin roles.
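As a sketch of limiting who can use the linked server rather than opening it to everyone, sp_addlinkedsrvlogin can map just one local login to a low-privilege remote account. The server name, login names, and password below are placeholders.

```sql
-- Remove the default "connect as self" mapping for all logins,
-- then map a single local login to a low-privilege remote account.
EXEC sp_droplinkedsrvlogin @rmtsrvname = N'RemoteSrv', @locallogin = NULL;

EXEC sp_addlinkedsrvlogin
    @rmtsrvname  = N'RemoteSrv',
    @useself     = N'FALSE',
    @locallogin  = N'DOMAIN\ReportUsers',
    @rmtuser     = N'report_reader',
    @rmtpassword = N'PlaceholderPassword';
```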

Steve
http://stevefibich.net
http://utilitydb.codeplex.com/
Post #465977
Posted Friday, March 7, 2008 2:17 PM
SSCrazy
The thing that worries me with Linked Servers is the security aspect.

Our SQL2000 boxes won't delegate but our SQL2005 boxes work just fine.

We have trawled through just about every obscure knowledgebase article on delegation to find out why.

If you can't get delegation to work and use a fixed account instead, then you have a security risk with anyone who can log onto your local box: they can do whatever the linking account allows them to do on the remote box!
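The two login-mapping choices being contrasted here look like this with sp_addlinkedsrvlogin (the server and account names are placeholders):

```sql
-- Delegation: each user's own Windows credentials are passed through
-- to the remote box (requires working Kerberos delegation end to end).
EXEC sp_addlinkedsrvlogin @rmtsrvname = N'RemoteSrv', @useself = N'TRUE';

-- Fixed account: everyone who can reach the local box acts as this one
-- remote login, which is the security risk described above.
EXEC sp_addlinkedsrvlogin
    @rmtsrvname  = N'RemoteSrv',
    @useself     = N'FALSE',
    @locallogin  = NULL,                  -- applies to all local logins
    @rmtuser     = N'shared_account',
    @rmtpassword = N'PlaceholderPassword';
```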



LinkedIn Profile
Newbie on www.simple-talk.com
Post #466133
Posted Friday, March 7, 2008 4:23 PM
Grasshopper
Good article ...

I use linked servers a lot and have had performance issues as well, however my queries did not return a whole lot of data (they just took a long time to run). Even though my queries were optimized (i.e., they used index seeks when run directly on the remote server), they produced table scans when I ran them as linked server queries. My solution was to write stored procedures on the remote servers and then call the stored procedures.

So, instead of doing this:
Select * from linkedserver.database.owner.table
I do this:
Exec linkedserver.database.owner.storedprocedure
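If the remote procedure takes parameters or you need the rows locally, the same pattern extends naturally (the procedure and column names below are hypothetical). One caveat: INSERT ... EXEC against a linked server may try to start a distributed transaction, depending on server settings.

```sql
-- Capture the remote procedure's result set into a local temp table.
CREATE TABLE #orders (order_id INT, total MONEY);

INSERT INTO #orders (order_id, total)
EXEC linkedserver.database.dbo.usp_GetOrders @CustomerID = 42;
```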




Kindest Regards,

DrewTheEngineer

Post #466205
Posted Monday, March 10, 2008 6:01 AM
SSC Rookie
The response to this article has been great! I especially liked the ideas of remote joins, OPENQUERY, remote stored procedures, and table-valued functions.

Almost every DBA shop needs to connect to remote servers, and the feedback you have provided will be helpful to many people.

As some of you mentioned, replication would be ideal in this situation, and I agree. At a previous position, transactional replication to reporting and web servers was used to retrieve real-time information, balance the load on our primary servers, and avoid remote connections.

I will keep your responses archived, so that future remote connection projects can be optimized thoroughly.

Thanks,

Bill Richards, MCSE, MCDBA (Author of the Article)
Senior Database Analyst
Post #466596
Posted Monday, March 10, 2008 9:37 AM
Forum Newbie
Hello all
Is it possible to implement a federated database using linked servers? It would be nice if MSSQL 2000 or MSSQL 2005 could implement a strategy like the one mentioned in this example (http://www.infoq.com/presentations/shoup-ebay-architectural-principles).
Best Regards
Post #466768
Posted Friday, March 14, 2008 8:49 AM
Forum Newbie
I wonder if anyone has any comments about the use of linked servers to connect different instances running on the same machine? Is the performance substantially better? I would think it would be less critical in this case, since there is no physical network I/O.
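For what it's worth, a linked server between two instances on one machine is set up the same way (the machine and instance names below are placeholders); the connection then uses shared memory or local loopback rather than the physical network, though the serialization overhead of the linked-server layer remains.

```sql
-- Link to a second instance on the same machine.
EXEC sp_addlinkedserver
    @server     = N'LOCALBOX\SECONDINSTANCE',
    @srvproduct = N'SQL Server';

SELECT TOP (10) *
FROM [LOCALBOX\SECONDINSTANCE].SomeDb.dbo.SomeTable;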
Post #469442