Blog Post

Partitioning 3: Beyond just partition elimination

,

In Partitioning 2, I showed how to analyze which partitions were accessed by our Index Seek. However, we were searching the entire year’s partition for data. What if we filtered more specifically on the partitioning key?

Yesterday’s query: redux

Taking the same query, but this time let’s just search for a single month of CreationDates. In part 2, the query for the entire year read 130 pages of the Posts table, and now we just want September:

SET STATISTICS IO ON;
SELECT Score FROM Posts_Partitioned as p 
JOIN Users as u on u.Id = p.LastEditorUserId
WHERE u.DisplayName = 'Community'
AND p.CreationDate >= '2009-09-01' and p.CreationDate < '2009-10-01'
/*
(3590 rows affected)
Table 'Posts_Partitioned'. Scan count 1, logical reads 130,
 physical reads 0, read-ahead reads 0, lob logical reads 0, 
lob physical reads 0, lob read-ahead reads 0.
Table 'Users'. Scan count 1, logical reads 3, physical reads 0,
 read-ahead reads 0, lob logical reads 0, lob physical reads 0,
 lob read-ahead reads 0.
*/

Here’s the execution plan for the new query as well:

Details on the Posts_Partitioned Index Seek:

It’s a very similar execution plan, and the query looks fast. However, even though it returned less rows than the query in Partitioning 2, it still read the same amount of pages, 130.

Index design

At this point, we need to look at the index design. The index being used is defined as:

CREATE NONCLUSTERED INDEX ix_Posts_Partitioned_LastEditorUserId 
ON Posts_Partitioned(LastEditorUserId)
INCLUDE (Score) 
ON Posts_Partition_Scheme(CreationDate)
GO

So, let’s try changing the keys to see if we can get a better execution plan. First, let’s add CreationDate to the keys.

CREATE NONCLUSTERED INDEX ix_Posts_Partitioned_LastEditorUserId_CreationDate
ON Posts_Partitioned(LastEditorUserId,CreationDate)
INCLUDE (Score) 
ON Posts_Partition_Scheme(CreationDate)
GO

And re-running our query:

SET STATISTICS IO ON;
SELECT Score FROM Posts_Partitioned as p 
JOIN Users as u on u.Id = p.LastEditorUserId
WHERE u.DisplayName = 'Community'
AND p.CreationDate >= '2009-09-01' and p.CreationDate < '2009-10-01'
/*
(3590 rows affected)
Table 'Posts_Partitioned'. Scan count 1, logical reads 16,
 physical reads 0, read-ahead reads 0, lob logical reads 0,
 lob physical reads 0, lob read-ahead reads 0.
Table 'Users'. Scan count 1, logical reads 3, physical reads 0,
 read-ahead reads 0, lob logical reads 0, lob physical reads 0,
 lob read-ahead reads 0.
*/

That’s a lot better! And here’s the index seek’s information:

Click image for better quality

Even better. Let’s compare these two execution plans. The left side is with the original index, the right side is after, with the CreationDate key added.

What’s the moral of this post?

The point I want to make is that post-partitioning, you may have to re-think some of your existing indexes. Even if you add the partitioning key to all your queries, that can change your workload enough that you’ll want to examine your indexes.

Stay tuned!

P.S. What if we put CreationDate as the first key in the index?

I’ll drop the index used in this example, and create an index with CreationDate as the first key column.

DROP INDEX ix_Posts_Partitioned_LastEditorUserId_CreationDate 
ON Posts_Partitioned
GO
DROP INDEX ix_Posts_Partitioned_LastEditorUserId 
ON Posts_Partitioned
GO
CREATE NONCLUSTERED INDEX ix_Posts_Partitioned_CreationDate_LastEditorUser
ON Posts_Partitioned(CreationDate,LastEditorUserId)
INCLUDE (Score) 
ON Posts_Partition_Scheme(CreationDate)
GO

And now we re-run the query from above. Let’s take a look at the stats and the execution plan:

SET STATISTICS IO ON;
SELECT Score FROM Posts_Partitioned as p 
JOIN Users as u on u.Id = p.LastEditorUserId
WHERE u.DisplayName = 'Community'
AND p.CreationDate >= '2009-09-01' and p.CreationDate < '2009-10-01'
/*
(3590 rows affected)
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0,
 read-ahead reads 0, lob logical reads 0, lob physical reads 0, 
lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, 
read-ahead reads 0, lob logical reads 0, lob physical reads 0, 
lob read-ahead reads 0.
Table 'Posts_Partitioned'. Scan count 1, logical reads 383,
 physical reads 0, read-ahead reads 1, lob logical reads 0,
 lob physical reads 0, lob read-ahead reads 0.
Table 'Users'. Scan count 1, logical reads 3, physical reads 0,
 read-ahead reads 0, lob logical reads 0, lob physical reads 0,
 lob read-ahead reads 0.
*/

Ouch, that’s a lot more reads!

Click image to zoom in

So with the key order of CreationDate first, our query got even worse than before! I hope this was an interesting side note.

Stay tuned!

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating