Procedures involving two servers

  • This is a problem I've been bashing my head against for the past few days, so I figured I'd post here and try to get some ideas.

    The basic scenario is this: I have a stored procedure on server A, half of whose work accesses data on server A, while the other half accesses data on server B.

    This worked fine initially, when the amount of data was relatively small. As time has gone on, though, the runtime has been increasing exponentially. What started out at around 5 minutes now takes well over 3 hours to finish. This is obviously something I need to address.

    In order to do so, I brainstormed a bit and decided the solution would be to have the stored procedure do the work on server A, then invoke a procedure on server B, which would do the work on server B and return its results to the calling procedure. The procedure would then combine the two result sets.

    So I wrote the code and tried it. No luck. The server returned an error because I was attempting to store the result set of a remote procedure call in a temp table. To do that, I would need to enable the Distributed Transaction Coordinator service, amongst other things, which is something I didn't want to do.
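    For anyone hitting the same wall, the failing pattern looks roughly like this (server, database, and procedure names are all made up for illustration):

    ```sql
    -- Hypothetical names throughout (ServerB, SalesDB, usp_GetRemoteResults).
    CREATE TABLE #RemoteResults (ID INT, Amount MONEY);

    -- INSERT ... EXEC against a linked server gets promoted to a distributed
    -- transaction, so this raises an error unless MSDTC is enabled and
    -- configured on both servers.
    INSERT INTO #RemoteResults (ID, Amount)
    EXEC ServerB.SalesDB.dbo.usp_GetRemoteResults;
    ```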

    I searched around and found that another option might be to use XML instead, so I did some research and implemented an XML-based solution. After a few hours of messing around, I tried it, and, once again, no luck. This time I found out that XML data types are not supported in remote procedure calls.

    I looked up this error and found someone who had posted the same problem; the suggested solution was to use a VARCHAR data type instead and just interpret it as XML. I tested this, and, great! It worked! Except... when I tried to work it into my solution, I got yet another error. This time it was because VARCHAR(MAX) is not supported in remote procedure calls, and the data would be well over the 8,000-character limit of non-MAX VARCHAR.
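    A sketch of that workaround, with hypothetical object names, for anyone curious where it breaks down:

    ```sql
    -- Hypothetical sketch of the VARCHAR-as-XML workaround on SQL Server 2005.
    -- It only works while the payload fits in 8,000 characters, because
    -- VARCHAR(MAX) parameters are rejected by remote procedure calls.
    DECLARE @Payload VARCHAR(8000);

    EXEC ServerB.SalesDB.dbo.usp_GetResultsAsXml @Result = @Payload OUTPUT;

    DECLARE @Doc XML;
    SET @Doc = CAST(@Payload AS XML);  -- the cast is fine; the 8000 cap is the killer

    SELECT n.value('@id', 'INT') AS ID
    FROM   @Doc.nodes('/rows/row') AS x(n);
    ```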

    So now I'm out of ideas. The last solution I have is very ugly, but if I can't find anything better, I'll have to employ it: create a table on the remote server, then call the procedure, which stores its results in that table. The server with the stored procedure then reads that table. This *should* be faster, since it's just a table access instead of a large cross-server query.
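    In case it helps anyone picture it, the fallback would look something like this (names hypothetical):

    ```sql
    -- Hypothetical sketch of the staging-table fallback. Server B's procedure
    -- writes its half of the work into a permanent table; server A then reads
    -- that table with a plain four-part-name query.

    -- One-time setup on server B:
    -- CREATE TABLE SalesDB.dbo.StagedResults (ID INT PRIMARY KEY, Amount MONEY);

    -- On server A, inside the driving procedure:
    EXEC ServerB.SalesDB.dbo.usp_FillStagedResults;  -- populates dbo.StagedResults

    SELECT s.ID, s.Amount, a.LocalValue
    FROM   ServerB.SalesDB.dbo.StagedResults AS s
    JOIN   dbo.LocalResults AS a
           ON a.ID = s.ID;
    ```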

    If anyone else has any better ideas, I'd love to hear them.

  • Would it be possible to split the work?

    Server A calls a proc on Server B that gets/creates a smaller subset of the data in an actual table, for this data only. Note: I'm guessing this is a batch process.

    (Possible to call Server B with a comma-delimited varchar of target data?)

    After it returns, Server A would then perform its work from this subset?

  • What triggers this procedure/process? Does it need to be run on user demand or can it be done as a periodic scheduled job?

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data :: Running Totals :: Tally Table :: Cross Tabs/Pivots :: String Concatenation

  • Daryl-273852 (3/5/2010)


    Would it be possible to split the work?

    Server A calls a proc on Server B that gets/creates a smaller subset of data in an actual table for this data only. Note: guessing this is a batch process.

    (Possible to call Server B with a comma-delimited varchar of target data?)

    After return Server A would then perform work from this subset?

    That was my last-resort idea. I'd rather not have to resort to it, but if there are no better solutions, that's the one I'll employ.

    Garadin - it is indeed a scheduled job.

  • The amount of data (both the starting amount and the amount you need for your 'result' post-processing) is a big factor here, but there's also the option of breaking this into steps with scheduled jobs, which might let you flow-control it better.

    Also, the processing that occurs on A/B can matter. Are the processes mostly independent of each other for different data and then combined at the end, or is there back-and-forth, or what? Is the goal to split the processing onto two servers, or is it just that the data lives on two different servers?

    I'm thinking that most of your efficiency savings here are going to revolve around doing the least amount of data transfer between servers.
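    One way to keep cross-server traffic down, for what it's worth (table and server names here are hypothetical): OPENQUERY executes the whole query on the remote server, so only the final filtered rows cross the wire, whereas a four-part-name query can end up pulling far more data back before filtering locally.

    ```sql
    -- Runs entirely on ServerB; only the matching rows come back.
    SELECT *
    FROM OPENQUERY(ServerB,
        'SELECT ID, Amount
         FROM   SalesDB.dbo.BigTable
         WHERE  ProcessDate >= ''20100301''');
    ```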

    Seth Phelabaum

  • What is essentially happening is this: the stored procedure builds a set of ID numbers from some data. From those ID numbers, some data is pulled from server A and some from server B. The two sets are then combined and written to a table.

    My initial idea was to send those ID numbers to server B, have it process the data, and then send the results back. This all works perfectly, *except* for the part where the data comes back. For some reason, sending data back and then trying to make use of it is virtually impossible; every solution I've attempted has some problem with it.

    Which is why I'm thinking the only solution is to store the result data in a table on server B and just query that table instead.

  • How hard would it be to break these steps into Agent job steps? I could be completely off base here, but usually when I see problems like this, it's because the optimizer is doing something you're not expecting it to. That said, I don't have a ton of experience with it, so that 'usually' is like the two times I've seen it :-).

    Something like the following:

    Server A Job 1 Step 1: Truncate Table on Server B

    Server A Job 1 Step 2: send those ID numbers to server B (Insert data into table on Server B)

    Server A Job 1 Step 3: call Server B Job 1. (EXEC ServerB.MSDB.dbo.sp_start_job @Job_Name = 'ServerBJob1')

    Server B Job 1 Step 1: have it process the data,

    Server B Job 1 Step 2: Truncate Table on Server A

    Server B Job 1 Step 3: and then send the data back. (Insert data into table on Server A)

    Server B Job 1 Step 4: Call Server A Job 2. (EXEC ServerA.MSDB.dbo.sp_start_job @Job_Name = 'ServerAJob2')

    Server A Job 2 Step 1: make use of that data

    The goal here is just to make sure that the only time data crosses servers is when you specifically want it to. Maybe that's not the issue at all though.
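    The hand-off steps above would be plain T-SQL inside each job (names hypothetical, and the linked servers would need RPC Out enabled for the remote sp_start_job calls):

    ```sql
    -- Server A Job 1, Step 2: push only the ID list across the wire.
    INSERT INTO ServerB.SalesDB.dbo.WorkIDs (ID)
    SELECT ID FROM dbo.CurrentBatchIDs;

    -- Server A Job 1, Step 3: start the processing job on server B.
    -- Note sp_start_job is asynchronous -- it returns as soon as the job is
    -- queued, which is why the chain continues via server B calling back
    -- rather than server A waiting on it.
    EXEC ServerB.msdb.dbo.sp_start_job @job_name = N'ServerBJob1';
    ```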

    Seth Phelabaum

  • Yeah, that solution is pretty much the same as the last-resort one; the only difference is that instead of writing the data on server B, you're proposing to write it on server A.

    Ideally I want to avoid writing data to a table for the express purpose of having it read back - it seems to me like it should be possible to just transfer that data across the network. But since doing so apparently requires some crazy workarounds, I guess the table is the only solution.

    I think in SQL Server 2008 I would be able to pass the table as a variable to the stored procedure, but alas, that doesn't exist in SQL Server 2005.
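    For reference, the 2008 feature in question is table-valued parameters. A sketch of the syntax (object names made up) - though, as far as I know, even on 2008 TVPs aren't accepted in calls across a linked server, so they'd only help with the local half of the work:

    ```sql
    -- SQL Server 2008+ only, and local calls only: table-valued parameters
    -- cannot be passed to a procedure on a linked server.
    CREATE TYPE dbo.IDList AS TABLE (ID INT PRIMARY KEY);
    GO
    CREATE PROCEDURE dbo.usp_ProcessIDs
        @IDs dbo.IDList READONLY
    AS
        SELECT d.*
        FROM   dbo.Detail AS d
        JOIN   @IDs AS i ON i.ID = d.ID;
    GO
    DECLARE @Batch dbo.IDList;
    INSERT INTO @Batch (ID) VALUES (1), (2), (3);
    EXEC dbo.usp_ProcessIDs @IDs = @Batch;
    ```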

  • So, as a follow-up: I implemented the "worst-case" solution described above, and it took the process from one that could run as long as 10 hours down to around 5 minutes. Not bad!

    Though, if anyone has a solution that doesn't involve staging the data in a table and actually allows seamless cross-server communication, I'd still love to hear it.
