Derived table in linked server query behaving as a subquery -- 1 execution per row

  • I have a stored procedure that builds out a table for reporting purposes. The procedure runs every 5 minutes to completely rebuild the table, as it needs to be 'nearly' real time data. The total rows are usually ~6,000, and because it is a bit complicated to pull all the information together it takes about 5 seconds or so to complete. The snip below is the final part, where we refer to our warehouse management system to identify any orders that have not been shipped so we can delete those rows.

    The portion in question is the derived table:

    (

    select orderid COLLATE SQL_Latin1_General_CP1_CI_AS AS Orderid

    ,custid COLLATE SQL_Latin1_General_CP1_CI_AS AS Custid

    ,shipdate

    FROM [LINKEDSERVER].wms.wms.orders so

    )

    This is a straightforward derived table, that returns about 30K rows. It is narrow and fairly concise, given the nature of things. It runs sub-second, and when run locally on its own server it results in about 10K reads. I expect this to behave as a 'table' (derived table) that the outer query can join to as we normally expect.

    The overall statistics for the entire query below is 172 logical reads and it takes about 1 second. The logical reads is what I would expect, as there are ~6K rows that need to be scanned.

    DELETE dbo.ReportingOrders

    WHERE orderid IN (

    SELECT t.orderid

    FROM

    (

    SELECT 'HOP'+REPLICATE('0', 8 - LEN(orderid)) + CAST(orderid AS varchar) as 'NumberFull'

    ,scheduleddate_shipping

    ,custnmbr

    ,ORDERID

    FROM [dbo].[ReportingOrders]

    WHERE OrderType IN ('Not Shipped, No Shipment','Not Shipped, With Shipment')

    ) t

    JOIN

    (

    select orderid COLLATE SQL_Latin1_General_CP1_CI_AS AS Orderid

    ,custid COLLATE SQL_Latin1_General_CP1_CI_AS AS Custid

    ,shipdate

    FROM [LINKEDSERVER].wms.wms.orders so

    )

    s ON t.custnmbr=s.custid AND t.numberfull=s.orderid

    AND t.scheduleddate_shipping=s.shipdate

    )

    HOWEVER, our database monitoring software has singled this out as a VERY resource intensive query. We did a trace, and it appears that the derived table is in fact being executed 1x for each of the 6K rows, returning a total of 220,000,000 rows !!!!

    To test if this is actually the case, I rewrote the query as follows:

    DECLARE @d DATETIME

    SET @d = DATEADD(dd,-60,GETDATE())

    select orderid COLLATE SQL_Latin1_General_CP1_CI_AS AS Orderid

    ,custid COLLATE SQL_Latin1_General_CP1_CI_AS AS Custid

    ,shipdate

    INTO #b

    FROM [LINKEDSERVER].wms.wms.orders so

    WHERE SHIPDATE>@d

    DELETE dbo.ReportingOrders

    WHERE orderid IN (

    SELECT t.orderid--,s.*

    FROM

    (

    SELECT 'HOP'+REPLICATE('0', 8 - LEN(orderid)) + CAST(orderid AS varchar) as 'NumberFull'

    ,scheduleddate_shipping

    ,custnmbr

    ,ORDERID

    FROM [dbo].[ReportingOrders]

    WHERE OrderType IN ('Not Shipped, No Shipment','Not Shipped, With Shipment')

    ) t

    JOIN

    #b

    /*(

    select orderid COLLATE SQL_Latin1_General_CP1_CI_AS AS Orderid

    ,custid COLLATE SQL_Latin1_General_CP1_CI_AS AS Custid

    ,shipdate

    FROM [LINKEDSERVER].wms.wms.orders so

    )*/

    s ON t.custnmbr=s.custid AND t.numberfull=s.orderid

    AND t.scheduleddate_shipping=s.shipdate

    )

    DROP TABLE #b

    Sure enough, this resulted in the query completely disappearing from the list of 'resource intensive queries', and of course it also runs faster.

    My takeaway from this is that derived tables in queries that use linked servers are handled COMPLETELY differently by the optimizer. We tend to use this quite often, so it will mean a lot of code review. I also wonder if linked server queries in general simply do not operate by normal sql server rules. Does anyone have any experience with issues like this?

    TIA d lewis

Viewing 0 posts

You must be logged in to reply to this topic. Login to reply