Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Odd performance query behavior on very large 40 TB partitioned table in distributed view, maybe it's tipping? Expand / Collapse
Author
Message
Posted Tuesday, July 16, 2013 10:09 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, October 10, 2013 7:39 PM
Points: 42, Visits: 122
First a bit of background. I recently started a job with a somewhat complicated setup for a 50 TB database. There is a 3 node failover cluster with two named SQL Server 2012 Enterprise instances with a linked server established from the first to the second instance. There are three very large tables in this database, and one is super-extra-large, and accounts for 40 of those TB. Of course I'm having issues with that very same large table, for "Probe" data.

The data has been sharded by year and month into separate databases, i.e. 2013JUL, 2013JUN, 2013MAY... 2012JAN and there is a main database MAINDB where the large tables have been removed, and instead a distributed view created which does a UNION ALL across the individual database Probe tables, including those on the linked server (data older than 2012 is on this second instance). This approach was taken in order to be as transparent as possible to the leading application. There is a check constraint on each individual probe table to ensure the Probe view in MAINDB allows inserts.

The Probe table is also partitioned in weeks on the clustering key. The clustering key is a datetime2 column named "reportStamp" in order to support date range based queries, and the primary key is a composite with reportStamp and a bigint Id column.

There is a single non-clustered index on ProbeData table with the following definition:

CREATE NONCLUSTERED INDEX [IX1_tblDWProbe_vehId_repSt_paramId_paramValue] ON [dbo].[tblDWProbe]
(
[vehicleId] ASC,
[reportStamp] ASC,
[parameterId] ASC,
[paramValue] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90)
GO

We are having performance issues on a query that is returning the most recent probe parameter value (column paramValue) within the last 7 days. To date the query has been invoked ad hoc from an SSIS package, and my goal is to wrap this in a stored proc.

We should know vehicleId, parameterId, and a reportStamp range, and should be able to fully leverage this index as we are attempting to return paramValue.

What we are seeing is sometimes very good performance and takes less than a minute, and other times it takes many minutes, and I have been told as much as a day. The few times I have "caught it in the act" I see the logical and physical read counts in the several millions, so it sorta feels like this query is tipping. I believe the above index should be covering, and my understanding is a covering non-clustered index won't tip. Maybe there is a nuance where paramValue needs to be an included column and is being ignored even though it's a key column? Maybe a parameter sniffing issue?

Here the query pulled out into a simple testing harness.

USE TEST_MAINDB
GO

DECLARE @BeginDate DATETIME2;
DECLARE @EndDate DATETIME2;
DECLARE @VehicleId INT;
DECLARE @ParameterId INT;

SET @BeginDate = DATEADD(d, -1, CURRENT_TIMESTAMP);
SET @EndDate = CURRENT_TIMESTAMP;
SET @VehicleId = 9940;
SET @ParameterId = 40024;

SELECT TOP 1
COALESCE( pr.[reportStamp], '-') AS [reportStamp]
,COALESCE( p.[parameterDesc], '-') AS [parameterDesc]
,COALESCE( pr.[paramValue], 999999) AS [paramValue]
FROM [dbo].[tblDWProbe] pr WITH (NOLOCK, FORCESEEK)
INNER JOIN [dbo].[tblParameters] p ON p.[parameterId] = pr.[parameterId]
WHERE pr.[vehicleId] = @VehicleId
AND pr.[reportStamp] BETWEEN @BeginDate AND @EndDate
AND pr.[parameterId] = @ParameterId
ORDER BY pr.[reportStamp] DESC
OPTION (RECOMPILE)

There is a great deal of contention from the leading application always doing inserts, thus the NOLOCK hint. We don't have the disk iron for me to be comfortable enabling snapshot isolation. In my limited testing I have observed too much TempDB contention.

I have tried adding OPTION (RECOMPILE) and observe two very different query plans. Also resorted to the FORCESEEK hint. I've tried to attach the XML to this post. Plan1.sqlpan is very complex, and Plan2.sqlplan appears relative simple.

As stated before the requirement is to go back 7 days, and I have tried in stages to start with 1 day, then 2, then 3 and observe performance and plans.

Sometimes I also get a missing index very similar to the above existing index, but with different key order and paramValue is an included column.

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[tblDWProbe]
(
[parameterId],
[vehicleId],
[reportStamp])
INCLUDE ([paramValue])

If the optimizer is not leveraging the non-clustered index, that would explain the disparity in performance. I haven't yet spun up profiler or gotten brave enough with extended events yet to see if this is the case. I think the approach before the Probe table was sharded and the distributed view came into the picture was to use explicit INDEX query hints, which has been now been negated by the view.

Any insights or pointers into how I should troubleshoot this critter are greatly appreciated!

Thanks!


  Post Attachments 
Plan1.sqlplan (3 views, 842.83 KB)
Plan2.sqlplan (2 views, 35.10 KB)
Plan2.jpg (3 views, 40.33 KB)
Plan1a.jpg (2 views, 77.11 KB)
Plan1b.jpg (1 view, 120.26 KB)
Plan1c.jpg (0 views, 111.89 KB)
Post #1474207
Posted Tuesday, July 16, 2013 10:41 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 1:45 PM
Points: 4,128, Visits: 5,837
I would LOVE to help you with this, but it goes WAY beyond the primary purpose of a forum post - that being short, targeted, straight-forward assistance. What you have isn't any of those.

My best advice is to get a really sharp performance tuning professional on board for a review. There are some good ones on this forum and elsewhere. Perhaps someone else will decide to jump in on their free time and try to help you out with this one. Best of luck!


Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1474224
Posted Tuesday, July 16, 2013 10:58 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, October 10, 2013 7:39 PM
Points: 42, Visits: 122
I appreciate the response, and yes I don't expect someone to volunteer for a remote DBA session and solve my problem for free. I just tried to flesh out a full background context and asked for some general pointers on approaches to take to solve this problem.

I was hoping for insights on some general approaches like "I would watch DMV 'abc' or you really need to boot up a profiler trace and watch 'xyz'." I know I've seen helpful general hints like this on this forum before. I'm trying to understand reasons why the optimizer will choose a completely different plan or disregard an index. Yes I know the stock answer is always "it depends", but that list of things it depends on is not infinite. Isn't the point of this forum to encourage learning and foster understanding?
Post #1474231
Posted Tuesday, July 16, 2013 1:06 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, October 10, 2013 7:39 PM
Points: 42, Visits: 122
UPDATE: For anyone that may encounter a similar issue.

I've dug a bit deeper, and I think what is happening is that sometimes the generated query plan undergoes partition elimination at compile time versus at runtime. When at runtime, I have verified that the number of executions is zero for tables with check constraints outside my range.

I still don't quite understand why the non-clustered index is sometimes ignored when it should be covering, but I have had the most luck with the FORCESEEK query hint.
Post #1474261
Posted Tuesday, July 16, 2013 9:01 PM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 1:45 PM
Points: 4,128, Visits: 5,837
jallmond (7/16/2013)
UPDATE: For anyone that may encounter a similar issue.

I've dug a bit deeper, and I think what is happening is that sometimes the generated query plan undergoes partition elimination at compile time versus at runtime. When at runtime, I have verified that the number of executions is zero for tables with check constraints outside my range.

I still don't quite understand why the non-clustered index is sometimes ignored when it should be covering, but I have had the most luck with the FORCESEEK query hint.


Thanks for posting the follow-up. Lots of posters don't realize that this makes the forum MUCH more valuable as people searching can find problems AND solutions!

I think another of the root causes is that DPVs just never got any real loving from the product team and haven't had much attention in many years I don't think.


Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1474355
Posted Wednesday, July 17, 2013 5:44 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 8:05 AM
Points: 14,802, Visits: 27,276
I don't know that this will help much, but, I'd look at the properties on the SELECT operator. Is the optimizer completing optimization or is it timing out? If the latter, you may just be seeing plan instability. In that case, I think you're already on the right track with a few query hints to move the optimizer in the right direction. If the former... I'm not sure what's going on.

----------------------------------------------------
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." Theodore Roosevelt
The Scary DBA
Author of: SQL Server 2012 Query Performance Tuning
SQL Server 2008 Query Performance Tuning Distilled
and
SQL Server Execution Plans

Product Evangelist for Red Gate Software
Post #1474509
Posted Wednesday, July 17, 2013 6:13 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, October 10, 2013 7:39 PM
Points: 42, Visits: 122
The optimizer is completing with FULL optimization, and the plan hashes match on subsequent runs.

I think the FORCESEEK hint is the what got me over the hump and is of the most benefit in this case.

For some reason I see inconsistent behavior with regard to whether it chooses to use the non-clustered index when using a simple INNER JOIN. With CROSS APPLY it seems always use the index. Weird.
Post #1474529
Posted Wednesday, July 17, 2013 6:16 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, October 10, 2013 7:39 PM
Points: 42, Visits: 122
TheSQLGuru (7/16/2013)


I think another of the root causes is that DPVs just never got any real loving from the product team and haven't had much attention in many years I don't think.


Surely the lack of real love for DPVs has nothing to do with wanting to sell more Enterprise licenses for table partitioning.
Post #1474532
Posted Wednesday, July 17, 2013 11:39 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 1:45 PM
Points: 4,128, Visits: 5,837
jallmond (7/17/2013)
TheSQLGuru (7/16/2013)


I think another of the root causes is that DPVs just never got any real loving from the product team and haven't had much attention in many years I don't think.


Surely the lack of real love for DPVs has nothing to do with wanting to sell more Enterprise licenses for table partitioning.


DING-DING-DING!!! We have a winner!!


Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1474747
Posted Thursday, July 18, 2013 4:17 PM


Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: 2 days ago @ 4:43 AM
Points: 756, Visits: 631
In difference to Kevin, I think that that there is some hope that we may be able to help you in this forum, and save you the cost from hiring a consultant. But no promises, we'll see.

In any case, the difference between the two plans is that in the good plan, the optimizer did partition elimination just fine, and only queried the database for July. In the bad plan, it queries all tables in the partitioned view.

So why would the optimizer do this? I have some theories, but I first need to see the view definition, and the definition of all tables, including their indexes, in the view. (Please post the table scripts as an attachment.)


Erland Sommarskog, SQL Server MVP, www.sommarskog.se
Post #1475233
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse