• Eric M Russell (3/3/2011)


    My most recent involvement with querying remote data involves joining huge resultsets originating in a 3rd party Oracle 11g database with reference data in a SQL Server 2008 datamart, and then inserting the result into into the same SQL Server datamart for later use by the reporting team. The SQL Server DTC and Oracle don't share much, and early experiments with 4 part named queries would result in entire table scans being pulled across the wire. When I EXEC a pass-though query to Oracle, I'm only getting a small DTC overhead, and the remaining 1 - 30 minutes of processing time occurs entirely on Oracle, and then however much time it takes to bring the final resultset back across the wire. Joining the temporary resultset with the reference tables is trivial at that point.

    Yes, I've been in a similar boat and feel your pain. Distributed query is not perfect by any means, though some of the problems can be traced to the remote end which may not provide the optimizer will good (or any) statistical information about the remote data. No doubt this is especially problematic where the remote server is not a SQL Server (it's been a few years since I had to work with Oracle). Do look into Gus' suggestion re: table variables to avoid a distributed transaction, or if 2008, try the remote transaction promotion option I described: it will prevent the local INSERT implicit transaction from being promoted to a full DTC transaction.