October 28, 2015 at 9:09 am
I have a stored procedure that builds out a table for reporting purposes. The procedure runs every 5 minutes to completely rebuild the table, as it needs to be 'nearly' real time data. The total rows are usually ~6,000, and because it is a bit complicated to pull all the information together it takes about 5 seconds or so to complete. The snip below is the final part, where we refer to our warehouse management system to identify any orders that have not been shipped so we can delete those rows.
The portion in question is the derived table:
(
select orderid COLLATE SQL_Latin1_General_CP1_CI_AS AS Orderid
,custid COLLATE SQL_Latin1_General_CP1_CI_AS AS Custid
,shipdate
FROM [LINKEDSERVER].wms.wms.orders so
)
This is a straightforward derived table, that returns about 30K rows. It is narrow and fairly concise, given the nature of things. It runs sub-second, and when run locally on its own server it results in about 10K reads. I expect this to behave as a 'table' (derived table) that the outer query can join to as we normally expect.
The overall statistics for the entire query below is 172 logical reads and it takes about 1 second. The logical reads is what I would expect, as there are ~6K rows that need to be scanned.
DELETE dbo.ReportingOrders
WHERE orderid IN (
SELECT t.orderid
FROM
(
SELECT 'HOP'+REPLICATE('0', 8 - LEN(orderid)) + CAST(orderid AS varchar) as 'NumberFull'
,scheduleddate_shipping
,custnmbr
,ORDERID
FROM [dbo].[ReportingOrders]
WHERE OrderType IN ('Not Shipped, No Shipment','Not Shipped, With Shipment')
) t
JOIN
(
select orderid COLLATE SQL_Latin1_General_CP1_CI_AS AS Orderid
,custid COLLATE SQL_Latin1_General_CP1_CI_AS AS Custid
,shipdate
FROM [LINKEDSERVER].wms.wms.orders so
)
s ON t.custnmbr=s.custid AND t.numberfull=s.orderid
AND t.scheduleddate_shipping=s.shipdate
)
HOWEVER, our database monitoring software has singled this out as a VERY resource intensive query. We did a trace, and it appears that the derived table is in fact being executed 1x for each of the 6K rows, returning a total of 220,000,000 rows !!!!
To test if this is actually the case, I rewrote the query as follows:
DECLARE @d DATETIME
SET @d = DATEADD(dd,-60,GETDATE())
select orderid COLLATE SQL_Latin1_General_CP1_CI_AS AS Orderid
,custid COLLATE SQL_Latin1_General_CP1_CI_AS AS Custid
,shipdate
INTO #b
FROM [LINKEDSERVER].wms.wms.orders so
WHERE SHIPDATE>@d
DELETE dbo.ReportingOrders
WHERE orderid IN (
SELECT t.orderid--,s.*
FROM
(
SELECT 'HOP'+REPLICATE('0', 8 - LEN(orderid)) + CAST(orderid AS varchar) as 'NumberFull'
,scheduleddate_shipping
,custnmbr
,ORDERID
FROM [dbo].[ReportingOrders]
WHERE OrderType IN ('Not Shipped, No Shipment','Not Shipped, With Shipment')
) t
JOIN
#b
/*(
select orderid COLLATE SQL_Latin1_General_CP1_CI_AS AS Orderid
,custid COLLATE SQL_Latin1_General_CP1_CI_AS AS Custid
,shipdate
FROM [LINKEDSERVER].wms.wms.orders so
)*/
s ON t.custnmbr=s.custid AND t.numberfull=s.orderid
AND t.scheduleddate_shipping=s.shipdate
)
DROP TABLE #b
Sure enough, this resulted in the query completely disappearing from the list of 'resource intensive queries', and of course it also runs faster.
My takeaway from this is that derived tables in queries that use linked servers are handled COMPLETELY differently by the optimizer. We tend to use this quite often, so it will mean a lot of code review. I also wonder if linked server queries in general simply do not operate by normal sql server rules. Does anyone have any experience with issues like this?
TIA d lewis
Viewing 0 posts
You must be logged in to reply to this topic. Login to reply