Developer making me look ignorant

  • It's the DBA's vs the Developers, again.

    No, not really. But I was asked a good question by one of our developers and I could use some help with it.

    The issue is how data is handled during the processing of a linked server query. The subject query will join eleven tables, with most of them on the "linked" server. The tables are wide, but only a few columns will actually be utilized in the query. The challenge is to design the process so that the minimum of data will cross the pipe. The developer asked me whether the SQL engine on the remote server would do some of the work and only send filtered results back to the originating server, or whether the entire table would need to be accessed and filtered on the home server.

    Good question. And if it turns out that the remote SQL engine won't do any work, can we employ it more effectively by creating views on the remote tables that will pre-extract just the columns we are interested in?

    Thanks for your help,

    Elliott

  • Start by looking at the execution plan to see exactly what it is doing.

    However, it is likely that it is pulling back too much data, so consider doing a call to a remote stored procedure to optimize the remote processing and minimize the traffic.

  • Hi Elliot,

    I think this may help:

    http://www.sql-server-performance.com/tips/linked_server_p1.aspx

    There are a lot of caveats when running a distributed query you have a very wise developer on your team to ask this question before the performance problem occurs 🙂

  • Thanks for both replies. The query hasn't been written yet, so we haven't been able to use the execution plan, but it appears clear that we should create some processes on the remote server to do some of the processing/filtering there.

    And yes, I am very lucky to have a great relationship with Development. Sometimes when I say "no, there's a better way", they actually listen!

    Cheers,

    Elliott

  • Elliott Berkihiser (3/23/2011)


    The developer asked me whether the SQL engine on the remote server would do some of the work and only send filtered results back to the originating server, or whether the entire table would need to be accessed and filtered on the home server.

    It can, but in most cases it won't.

    Good question. And if it turns out that the remote SQL engine won't do any work, can we employ it more effectively by creating views on the remote tables that will pre-extract just the columns we are interested in?

    No, views (unless indexed) are just cached select statements and (afaik for remote) get inlined during compile.

    What I normally recommend for this kind of scenario is OPENQUERY rather than 4-part naming. Take the piece that should be remote and send it to the remote server, don't let the servers decide.

    p.s. Go Developers! 😀

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I'm with Gail. Most of the time OPENQUERY puts the processing on the remote machine. You can get some filtering done there and then only bring back the data you need (Go Marketers).

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Gail, Grant

    Thanks.

    I understand how OPENQUERY directs the processing to the remote machine, and I'll look at that.

    I was curious about the situation with Views. Since the main filtering we want is to select only a few columns, but most of the rows, wouldn't a view on the remote machine have the effect of limiting the amount of data sent back to the originating server?

    (Gail, if you've already answered that, I apologize, but I didn't grasp it)

    Elliott

  • Elliott Berkihiser (3/23/2011)


    Gail, Grant

    Thanks.

    I understand how OPENQUERY directs the processing to the remote machine, and I'll look at that.

    I was curious about the situation with Views. Since the main filtering we want is to select only a few columns, but most of the rows, wouldn't a view on the remote machine have the effect of limiting the amount of data sent back to the originating server?

    (Gail, if you've already answered that, I apologize, but I didn't grasp it)

    Elliott

    The view won't make that any different than simply defining those columns in the SELECT criteria. Views don't store data. They just store a query, which retrieves data pretty much as any other query. When you query a view, you're just querying a query, no filtering is done except as you define it.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • OPENQUERY is your friend. It guarantees the query will be executed on the remote server.

    With 11 tables, most of which are on the remote server, you have your hands full that's for sure. The only complaint I have with OPENQUERY is that it will not accept a variable for the query parameter.

    Here are a couple more options for you to consider.

    1. Use a remote stored procedure:

    If enough of the work can be done on the remote server without having access to the local data, yet still reduce the dataset to a size you are comfortable transmitting over the pipe, then you could explore creating a stored procedure on the remote server. You could then execute the remote procedure using four-part naming and pipe the results into a local temp table.

    IF OBJECT_ID(N'tempdb..#tmp_table') > 0

    DROP TABLE #tmp_table

    GO

    CREATE TABLE #tmp_table (id INT, name VARCHAR(100)) ;

    GO

    INSERT INTO #tmp_table

    EXEC [.\SQLEXPRESS_2008].test.dbo.test_remote_proc

    SELECT *

    FROM #tmp_table

    go

    2. Use EXEC...AT:

    Another technique I have found useful is to use the EXEC...AT expression because it let's you use a variable and you can still pipe the results into a local temp table:

    IF OBJECT_ID(N'tempdb..#tmp_table') > 0

    DROP TABLE #tmp_table

    GO

    CREATE TABLE #tmp_table (id INT, name VARCHAR(100)) ;

    GO

    DECLARE @query NVARCHAR(MAX) = N'SELECT 1 AS id, ''my name'' as name;' ;

    INSERT INTO #tmp_table

    (

    id,

    name

    )

    EXEC (

    @query

    ) AT [LINKED_SERVER_NAME] ;

    SELECT *

    FROM #tmp_table ;

    GO

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (3/23/2011)


    2. Use EXEC...AT:

    Another technique I have found useful is to use the EXEC...AT expression because it let's you use a variable and you can still pipe the results into a local temp table:

    ... ooooh, that's neat. I hadn't seen that one before.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Elliott Berkihiser (3/23/2011)


    I was curious about the situation with Views. Since the main filtering we want is to select only a few columns, but most of the rows, wouldn't a view on the remote machine have the effect of limiting the amount of data sent back to the originating server?

    By the time the execution starts, there's no mention of the view left. Let's look at an example just with one server. Assume for a moment that there are two tables, T1 and T2, each with 10 columns. A view is defined

    CREATE View V1

    AS

    SELECT t1.c1, t1.c2, t2.c4, t2.c5, t2.c6

    FROM t1 inner join t2 on t1.c1 = t2.c2

    WHERE t2.c10 is not null

    Then a query is executed against that view

    SELECT * FROM V1 WHERE c6 > 0

    During parsing, the name of the view is replaced by the definition

    SELECT * FROM (

    SELECT t1.c1, t1.c2, t2.c4, t2.c5, t2.c6

    FROM t1 inner join t2 on t1.c1 = t2.c2

    WHERE t2.c10 is not null

    ) V1 WHERE c6 > 0

    That query is what is then optimised and executed.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Craig Farrell (3/23/2011)


    opc.three (3/23/2011)


    2. Use EXEC...AT:

    Another technique I have found useful is to use the EXEC...AT expression because it let's you use a variable and you can still pipe the results into a local temp table:

    ... ooooh, that's neat. I hadn't seen that one before.

    Neat indeed. I've learned something... 🙂

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply