September 20, 2013 at 5:30 pm
Hi Brandie,
Your first two updates need the "Extract" column, which is not covered by the filtered indexes, so a decision has to be made whether to index seek and key lookup or ci scan - that will depend a lot on the row estimates - and you can see that update #2 with many more rows in the estimate has used the CI scan to avoid the key lookups. I would personally look at performing both updates in one query to see how that fared...
Query 3 looks fine because it doesn't need the Extract column (and thankfully SQL knows this and has ignored the SELECT on it), however it does have a relatively high cost for updating the index idx_staging_unid, which I personally would drop like a hot brick as that index is not being useful to you, it is just being scanned once in query 4 - which is a killer query!
Query 4 is RBAR on acid! Triangular join from hell with over 500 billion rows in one section. The comment says you are linking "3" records to "2" records, but there is no filter on recordid = 2 in the cross apply - is this correct? can it be done another way? if the "3" record just needs to be linked to the "2" record that comes immediately before it, I suspect a simple join on ID-1 would work better? don't know the whole story though...
that's enough for now 🙂
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
September 24, 2013 at 5:55 am
wolfkillj (9/20/2013)
In just a quick review of your DDL, I didn't see an index that would cover this query. FIDX_StagingRecID2 comes close but it doesn't include the Extract_RI column.
*headdesk* Of course I missed that one.
But it doesn't explain why the second update (which does much the same thing on different columns) is able to use its index. It does an Index Scan as opposed to a Key Lookup.
That's why I'm confused.
September 24, 2013 at 6:00 am
mister.magoo (9/20/2013)
Query 4 is RBAR on acid!
Tell me about it.
Triangular join from hell with over 500 billion rows in one section. The comment says you are linking "3" records to "2" records, but there is no filter on recordid = 2 in the cross apply - is this correct? can it be done another way?
Sure. It can be done the way the coworker originally did it which meant it took 2+ days to process. @=)
if the "3" record just needs to be linked to the "2" record that comes immediately before it, I suspect a simple join on ID-1 would work better? don't know the whole story though...
The problem being is that the only way the 3 records are "joined" to the 2 records in the extract is by physical order. There is no other way to identify which 3 records belong to a 2 record. Hence, yes, the RBAR query from hell. The CROSS APPLY and indexes make it work so much faster than it did before by like a million times. We're now processing these files in 15 minutes or less, depending on the size of the import file.
Sometimes RBAR is necessary no matter how much one wants a set based solution. The only other option is to get our extract provider to supply that connecting key, but that takes months to request, then we get charged for it, then it has to be tested... And per the boss we need a solution now, not next year. Hence... RBAR on acid.
September 24, 2013 at 6:07 am
GAH. I keep forgetting that I can't create an index on the Extract column because the column is too big.
I guess I'll have to live with the Key Lookup after all.
September 24, 2013 at 6:54 am
Brandie Tarvin (9/24/2013)
GAH. I keep forgetting that I can't create an index on the Extract column because the column is too big.I guess I'll have to live with the Key Lookup after all.
Well, you can't use the Extract column (a varchar(2000) column, right?) as a key column (because it exceeds the 900B key maximum key width, of course), but you should be able to make it an included column, or am I overlooking something? I didn't see any place in your code (in either this thread or the original thread from May) where you would benefit from having the Extract column as part of a key, or am I again overlooking something?
Jason Wolfkill
September 24, 2013 at 7:19 am
Hrmm... You know, you're right. I can't think of any reason why it wouldn't work in the INCLUDE list...
Thanks.
September 24, 2013 at 8:03 am
Brandie Tarvin (9/24/2013)
But it doesn't explain why the second update (which does much the same thing on different columns) is able to use its index. It does an Index Scan as opposed to a Key Lookup.
See if it helps http://www.brentozar.com/archive/2013/05/indexing-wide-keys-in-sql-server/
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
September 24, 2013 at 8:24 am
Bhuvnesh (9/24/2013)
Brandie Tarvin (9/24/2013)
But it doesn't explain why the second update (which does much the same thing on different columns) is able to use its index. It does an Index Scan as opposed to a Key Lookup.See if it helps http://www.brentozar.com/archive/2013/05/indexing-wide-keys-in-sql-server/
That's an interesting link. I'm going to have to think about how it could work given that my query doesn't use the column in the WHERE clause.
September 24, 2013 at 8:38 am
Brandie Tarvin (9/24/2013)
That's an interesting link. I'm going to have to think about how it could work given that my query doesn't use the column in the WHERE clause.
Beside this how much records in your UPDATE query..i cant open your exec plan ... so tell me count
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
September 24, 2013 at 8:40 am
Brandie Tarvin (9/24/2013)
wolfkillj (9/20/2013)
In just a quick review of your DDL, I didn't see an index that would cover this query. FIDX_StagingRecID2 comes close but it doesn't include the Extract_RI column.*headdesk* Of course I missed that one.
But it doesn't explain why the second update (which does much the same thing on different columns) is able to use its index. It does an Index Scan as opposed to a Key Lookup.
That's why I'm confused.
I assume that "second update" means the second update query shown in the execution plan you posted?
I suspect that the optimizer chose a Clustered Index Scan in that case because the statistics showed that the predicate RecordID = 3 would yield an estimated 634,769 rows. I'm fairly certain that's WAY over the tipping point where an index seek plus key lookup becomes too expensive. The actual number of rows was 658,658, so the optimizer probably chose correctly.
EDIT: Plus, the cardinality of that table is 662,572, so you're getting almost all the rows anyway.
Compare that to the first query that uses an index seek and key lookup - estimated rows coming out of the index seek was 3,910 and actual rows was 3,910. Here, the optimizer decided it's cheaper to do the index seek and key lookup for 3,910 rows than to scan the clustered index.
Jason Wolfkill
Viewing 10 posts - 16 through 25 (of 25 total)
You must be logged in to reply to this topic. Login to reply