Linked server throughput/performance

What kind of performance can you expect when extracting data through a linked server?

I am extracting large volumes of data from Oracle into SQL Server (~1M rows each day), which takes about 2 hours. 100,000 rows take about 14 minutes.

The data file containing only this table is ~850MB. It's getting to the point where my extraction is failing in Oracle with "snapshot too old" errors.

So... how does it work, and where is the bottleneck?

The same select query in Oracle takes about 2-3 minutes. Copying/inserting the same data within SQL Server also takes only 2-3 minutes. The PC running SQL Server appears idle: little activity on CPU, and nothing on disk until the last few seconds, when the query completes and inserts the data.

So if selecting the data from Oracle isn't the problem, and writing the data isn't the problem, I presume the overhead is something to do with transporting the data. Is it something to do with returning the record set as individual rows?

I have tried breaking the query into smaller parts and running 2 in parallel, and it seems to have doubled the throughput. This suggests to me some sort of limitation in the protocol/method I am using. (I'm considering executing multiple queries in parallel as a workaround, but this has its own set of headaches.)

I know this is a difficult question, but can anyone tell me what level of throughput is achievable or reasonable? Either in rows/minute or MB/minute, e.g. "I select 100,000 rows of data equating to 100MB in 10 minutes" vs. "100K rows/100MB in 1 minute."

    Are there any parameters or options I can tweak that may affect performance?

    Is there a better alternative? ODBC? JDBC?

Edit: SQL Server is running on a WinXP PC with 2GB RAM.

The linked server uses MSDAORA; a rough sketch of the setup is below.
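For reference, it looks roughly like this (sketch only; the server name, TNS alias and table names here are placeholders, not the real ones):

-- register the Oracle linked server against the Microsoft provider
EXEC sp_addlinkedserver
    @server = N'LNK_ORACLE',
    @srvproduct = N'Oracle',
    @provider = N'MSDAORA',
    @datasrc = N'MY_TNS_ALIAS';

-- the daily pull is essentially a four-part-name insert/select
INSERT INTO dbo.extract_stage
SELECT * FROM LNK_ORACLE..MY_SCHEMA.BIG_TABLE;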

The Oracle server is 10g on a honking great big Sun box.

I'm just tossing this out as a thinking process, as I don't know what's going on behind the scenes.

800MB of data should transfer across a 100Mbit connection in about 1 minute and 20 seconds (12.5MB/s, minus overhead/kernel time, is about 10MB/s). A gigabit link should transfer it in next to no time.

Only two things come to mind: either the select statement requires SQL Server to build the result set in memory, and 850MB of data on a 2GB machine is going to slow it down; or selecting into a table with a clustered index/other indexes is slowing down the insert a lot.

These are just guesses, since no one else has an answer yet.

bcronce (9/30/2008)

I'm just tossing this out as a thinking process, as I don't know what's going on behind the scenes.

800MB of data should transfer across a 100Mbit connection in about 1 minute and 20 seconds (12.5MB/s, minus overhead/kernel time, is about 10MB/s). A gigabit link should transfer it in next to no time.

Only two things come to mind: either the select statement requires SQL Server to build the result set in memory, and 850MB of data on a 2GB machine is going to slow it down; or selecting into a table with a clustered index/other indexes is slowing down the insert a lot.

These are just guesses, since no one else has an answer yet.

Thanks for the reply. The PC seems to be relatively idle. If I reboot the machine and look at system resources, it has 1.5GB of physical RAM free, and when I run the query it's not obviously paging. You can see the insert at the end of the query hitting the disk, but up until then almost nothing is going on.

The insert is also into a truncated table with no constraints or indexes. And if I perform the same copy from table A to table B locally, it executes in a matter of seconds, which in theory should be even slower, as it would be reading and writing to the same disk.

Also, yesterday I tried performing selects of ~70,000 records (without the insert) via the link while tweaking the linked server options. The return times were pretty consistent (allowing a small difference for variable network and server load): 9:00, 8:43, 8:46 and 8:35 minutes, which equates to 130-135 records per second.

So the throughput for a single query appears to be consistently around the 130 records per second mark.

I ran 2 queries of approximately 70K records each in parallel => 140K records in total. Total elapsed time was 540 seconds, which equates to ~260 records per second.
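The split was along these lines (a sketch; two sessions run side by side, with made-up object names and key boundary):

-- session 1
INSERT INTO dbo.extract_stage
SELECT * FROM LNK_ORACLE..MY_SCHEMA.BIG_TABLE
WHERE pk_col < 500000;

-- session 2, started at the same time in a second connection
INSERT INTO dbo.extract_stage
SELECT * FROM LNK_ORACLE..MY_SCHEMA.BIG_TABLE
WHERE pk_col >= 500000;

Each session crawls along at roughly the same per-query rate, so together they move about twice as many rows per second.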

To me the bottleneck seems to be the transfer of records through the database link. I'm going to start looking into the OLE DB drivers and see if there are any patches or updates.

    Thanks again for the reply bcronce. Any other pointers or suggestions are welcome.

    Regards

    Pete

thepotplants (9/30/2008)

I'm going to start looking into the OLE DB drivers and see if there are any patches or updates.

Damnit... I have the latest version: MDAC 2.8 SP1.

  • I'm still working on this.

I just tried selecting from a test environment, and I get three times better throughput from a smaller machine.

I get 475 records per second vs 135!

This suggests to me that the bottleneck is on the Oracle end of the hose.

Well, I've been working on this for several days and learned some interesting things.

We discovered that the delay is definitely related to the transport of the records. Oracle completes the query in a few seconds and then sits constantly waiting on the client. We suspect the client fetches synchronously rather than asynchronously.

    After much searching we found this on MSDN:

    http://msdn.microsoft.com/en-us/library/bb332055(SQL.90).aspx#connssis_topic4

Look at the section about Oracle. To cut a long story short: MSDAORA hasn't been updated since Oracle 8i, and Microsoft themselves recommend using the Oracle-supplied provider rather than MSDAORA.

    Oracle forums have a section dedicated to OLEDB: http://forums.oracle.com/forums/forum.jspa?forumID=148&start=0

I'm now working through installation issues to get the Oracle provider, OraOLEDB, installed, in the hope that it will work better than MSDAORA.
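The plan is roughly this (a sketch: on SQL 2005, sp_MSset_oledb_prop is the scripted equivalent of ticking Allow Inprocess in the provider options dialog, and the TNS alias here is a placeholder):

-- let the Oracle provider run inside the SQL Server process
EXEC master.dbo.sp_MSset_oledb_prop N'OraOLEDB.Oracle', N'AllowInProcess', 1;

-- register a second linked server against the Oracle-supplied provider
EXEC sp_addlinkedserver
    @server = N'OLE_LNK_KADM01P',
    @srvproduct = N'Oracle',
    @provider = N'OraOLEDB.Oracle',
    @datasrc = N'MY_TNS_ALIAS';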

    Will let you know what I find.

I spent 12 hours yesterday trying to install the Oracle Provider for OLE DB. I got there in the end. :doze:

(I had lots of complications caused by having both the 9i and 10g clients installed.) :pinch:

I can happily report that once I got OraOLEDB installed, I can run the same queries side by side (one linked server using MSDAORA and the other using OraOLEDB), and the query using the Oracle provider runs more than twice as fast!! 😀
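The side-by-side test was nothing fancy: the same pull through each linked server with a timestamp either side, something like this (the timing scaffolding is illustrative):

DECLARE @t0 datetime;

SET @t0 = GETDATE();
SELECT * INTO #via_msdaora
FROM LNK_KADM01P..ARBORBP1.CDR_DATA_WORK
WHERE MIU_DISP_STATUS = 1 AND miu_error_code1 = 422;
SELECT DATEDIFF(second, @t0, GETDATE()) AS msdaora_seconds;

SET @t0 = GETDATE();
SELECT * INTO #via_oraoledb
FROM OLE_LNK_KADM01P..ARBORBP1.CDR_DATA_WORK
WHERE MIU_DISP_STATUS = 1 AND miu_error_code1 = 422;
SELECT DATEDIFF(second, @t0, GETDATE()) AS oraoledb_seconds;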

    For anyone else facing this problem, check out these resources:

    http://msdn.microsoft.com/en-us/library/ms190618.aspx

    http://www.oracle.com/technology/tech/windows/ole_db/index.html

    http://www.oracle.com/technology/software/tech/windows/odpnet/index.html

There is a huge amount of information on the Oracle OLE DB forums:

    http://forums.oracle.com/forums/forum.jspa?forumID=148&start=0

    Good luck.

  • More dramas... :doze:

After much uninstalling and reinstalling, I had both MSDAORA and OraOLEDB working side by side... and then it broke again... :crying:

    Started getting:

    Msg 7399, Level 16, State 1, Line 1 The OLE DB provider "MSDAORA" for linked server "LNK_KADM01P" reported an error. The provider ran out of memory.

    Msg 7330, Level 16, State 2, Line 1 Cannot fetch a row from OLE DB provider "MSDAORA" for linked server "LNK_KADM01P".

    OLE DB provider "OraOLEDB.Oracle" for linked server "OLE_LNK_KADM01P" returned message "".

    OLE DB provider "OraOLEDB.Oracle" for linked server "OLE_LNK_KADM01P" returned message "ROW-00001: Cannot allocate memory".

    Msg 7320, Level 16, State 2, Line 1 Cannot execute the query "SELECT * FROM ARBORBP1.CDR_DATA_WORK WHERE MIU_DISP_STATUS =1 and miu_error_code1 = 422" against OLE DB provider "OraOLEDB.Oracle" for linked server "OLE_LNK_KADM01P".

In the end, reinstalling the 9i ODAC components seemed to fix it. It seems I wasn't installing them correctly and only had 9.2.0.1 objects installed instead of 9.2.0.7.

    I also posted on oracle forums here: http://forums.oracle.com/forums/thread.jspa?threadID=667092&tstart=0

The good news is this now runs even faster than before. :Whistling: (fingers crossed...)

  • I know this is an old thread, but I was faced with the same problem and found out some useful stuff.

    I had a batch job that ran for 9 minutes.

    By checking the job history I tracked it down to 1 step in the job consuming all but a few seconds.

This step had one select statement in a cursor that returned 1,600 rows; then, inside the loop, a single-record linked server select by primary key followed by a single-record linked server update by primary key.
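The shape of the offending step, reconstructed from memory (I've reused the test table names from further down as stand-ins for the real tables, and the cursor's driving table is a placeholder):

DECLARE @pk_val int, @val varchar(30);
DECLARE work_cur CURSOR FOR
    SELECT pk_col FROM dbo.work_queue;  -- the ~1,600-row driving select
OPEN work_cur;
FETCH NEXT FROM work_cur INTO @pk_val;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- one single-record select across the linked server...
    SELECT @val = td.col_a
    FROM linkedDB..test.test_data td
    WHERE td.col_pk = @pk_val;

    -- ...followed by one single-record update across it
    UPDATE linkedDB..test.test_data
    SET col_x = @val
    WHERE col_pk = @pk_val;

    FETCH NEXT FROM work_cur INTO @pk_val;
END
CLOSE work_cur;
DEALLOCATE work_cur;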

Revamping the code to use set-based rather than record-based statements wasn't an option, for a variety of good reasons, if an easier fix could be found.

    I set up a test table with 55,000 records in it in my development environment.

    I built a simple test script that performed two loops 100 times. The first loop ran a simple select statement:

select td.col_a
from linkedDB..test.test_data td
where td.col_pk = @pk_val;

    The second loop had the following update statement in it:

update linkedDB..test.test_data
set col_x = 'x'
where col_pk = @pk_val;
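Wrapped in the timing harness, the select loop looked something like this (reconstructed; how @pk_val was chosen each pass is a detail I've improvised):

DECLARE @i int, @pk_val int, @dummy varchar(30), @t0 datetime;
SET @t0 = GETDATE();
SET @i = 1;
WHILE @i <= 100
BEGIN
    SET @pk_val = @i * 550;  -- spread the 100 probes across the 55,000-row table
    SELECT @dummy = td.col_a
    FROM linkedDB..test.test_data td
    WHERE td.col_pk = @pk_val;
    SET @i = @i + 1;
END
SELECT DATEDIFF(second, @t0, GETDATE()) AS select_loop_seconds;

The update loop was the same shell around the update statement above.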

I started with a linked server using the Oracle provider (OraOLEDB.Oracle) with the Allow Inprocess option checked.

    Time for 100 Select Statements: 185 seconds!

    Time for 100 Update Statements: 347 seconds!

    Frankly, that's pathetically slow!

    I tried changing a number of parameters with no useful results. I either re-started the database services or rebooted the machine, as appropriate, between each test as some of the settings didn't take effect without doing so.

    Linked provider settings I tested that provided unworkable or slower output:

    index as access path

    lazy schema validation.

    ODBC property settings I tested that provided slower output:

    enable statement caching.

One provider property gave a significant performance boost, but still left things way too slow. Checking Non transacted updates gave this set of results:

    Time for 100 Select Statements: 137 seconds!

    Time for 100 Update Statements: 294 seconds!

    That's a 20 to 25% improvement but still too darn slow.

    But a 20-25% improvement, across the board, with no code changes isn't anything to sneeze at, either. 🙂

    I have not yet tested whether changing this setting will break something else, or what kind of code it would tend to break.
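For anyone wanting to script that setting rather than use the provider properties dialog, my understanding (treat the property token as an assumption; I only ever used the dialog) is that it's a provider-level option:

-- assumed token for the "Non transacted updates" checkbox on OraOLEDB.Oracle
EXEC master.dbo.sp_MSset_oledb_prop N'OraOLEDB.Oracle', N'NonTransactedUpdates', 1;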

I then rewrote the select and the update statement using the OPENQUERY function.

    See http://msdn.microsoft.com/en-us/library/ms188427.aspx?ppud=4
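The rewrites look roughly like this. One gotcha worth noting up front: OPENQUERY only accepts a literal query string, so a variable key has to be spliced in with dynamic SQL (names as per the test above):

-- select through OPENQUERY; the whole statement is built and EXECed
DECLARE @sql nvarchar(500), @pk_val int;
SET @pk_val = 550;
SET @sql = N'SELECT col_a FROM OPENQUERY(linkedDB, '
         + N'''SELECT col_a FROM test.test_data WHERE col_pk = '
         + CAST(@pk_val AS nvarchar(10)) + N''')';
EXEC (@sql);

-- an update can target OPENQUERY directly when the key is a literal
UPDATE OPENQUERY(linkedDB, 'SELECT col_x FROM test.test_data WHERE col_pk = 550')
SET col_x = 'x';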

    Time for 100 Select Statements: 1 second

    Time for 100 Update Statements: 1 second

    That's more like what I expected to begin with...

I'm sure that there are some gotchas to OPENQUERY that will come with more complicated statements, but that's a matter for additional testing.

    Hope this helps out!
