SQLServerCentral

Index alignment in partitioning
Matthew Darwin
SSCarpal Tunnel (4.9K reputation)

Group: General Forum Members
Points: 4935 Visits: 880
Good morning,

First up: I've yet to use partitioning in a production environment, and I last ran any partitioning-related code a few years back while studying for certification, so I'm definitely not an expert on the matter and only loosely clued up on the concepts.

I've recently started with a new employer, and they have just implemented a new system for SMS messaging. The database tables tracking the SMS messages being sent are going to get big, and so they have decided to implement partitioning on some of the tables using a partition scheme on the CreatedDate column; the DBA involved in designing the partitioning has left and I'm picking this up.

The relevant DDL for the table is below:-

CREATE TABLE [Message].[Sms](
    [SmsId] [bigint] IDENTITY(250000001,1) NOT NULL,
    [CreatedDate] [datetime] NOT NULL CONSTRAINT [DF_Sms:CreatedDate] DEFAULT (getdate()),
    CONSTRAINT [PK_Sms:SmsId] PRIMARY KEY NONCLUSTERED
    (
        [SmsId] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
)
GO

CREATE CLUSTERED INDEX [IX:SMS:Clustered] ON [Message].[Sms]
(
    [SmsId] ASC,
    [CreatedDate] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90)
GO

CREATE NONCLUSTERED INDEX [IX:SMS:CreatedDate] ON [Message].[Sms]
(
    [CreatedDate] ASC
)
INCLUDE ([SmsId]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90)
GO



There are some issues with the above that I will be addressing separately (e.g. the clustered index should be unique as it contains the unique key, and the fillfactors are daft), but my concerns for this post are below.

1) How to define the primary key and enforce its uniqueness whilst trying to ensure it is aligned with the partition, in order to be able to switch out old data once an as yet undefined retention period has passed. Books Online states: "If it is not possible for the partitioning column to be included in the unique key, you must use a DML trigger instead to enforce uniqueness." (Books Online - Special Guidelines for Partitioned Indexes). However, I'm not sure what this means, nor how I create the primary key to use the partition function, seeing as it doesn't have CreatedDate in the unique key.
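For reference, the two routes Books Online describes might look something like this. This is only a sketch: the scheme name [DateScheme] and the trigger name are assumptions, not objects that exist in our system.

```sql
-- Option A: widen the key to include the partitioning column, so the
-- index can be aligned -- but this no longer enforces SmsId alone as unique:
-- ALTER TABLE [Message].[Sms] ADD CONSTRAINT [PK_Sms:SmsId]
--     PRIMARY KEY NONCLUSTERED ([SmsId], [CreatedDate])
--     ON [DateScheme]([CreatedDate]);

-- Option B (the DML trigger route from BOL): align a non-unique index
-- on SmsId and enforce logical uniqueness in a trigger:
CREATE TRIGGER [Message].[trg_Sms_UniqueSmsId]
ON [Message].[Sms]
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- If any SmsId touched by this statement now appears more than
    -- once in the table, roll the whole statement back.
    IF EXISTS (
        SELECT s.SmsId
        FROM [Message].[Sms] AS s
        JOIN inserted AS i ON i.SmsId = s.SmsId
        GROUP BY s.SmsId
        HAVING COUNT(*) > 1
    )
    BEGIN
        ROLLBACK TRANSACTION;
        RAISERROR ('Duplicate SmsId would violate the logical primary key.', 16, 1);
    END
END
```

The trade-off is that the trigger adds a lookup to every insert/update, whereas Option A keeps enforcement in the index at the cost of a weaker uniqueness guarantee.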

2) The original partition function was envisaged as the following:-


CREATE PARTITION FUNCTION [DateFunction](datetime) AS RANGE
LEFT FOR VALUES (N'2014-01-01T00:00:00.000'
, N'2014-04-01T00:00:00.000'
, N'2014-07-01T00:00:00.000'
, N'2014-10-01T00:00:00.000'
, N'2015-01-01T00:00:00.000'
, N'2015-04-01T00:00:00.000'
, N'2015-04-02T00:00:00.000'
, N'2015-04-03T00:00:00.000'
, N'2015-04-04T00:00:00.000'
, N'2015-04-05T00:00:00.000')
GO



There is a procedure that has been created and scheduled daily that will create a new partition for each day, and then merge these together at the end of the quarter. My understanding of partitioning is that this is a bad idea, as it will result in merging several populated partitions together. Is my understanding correct? If so, I'm planning on removing the day partitions at the end of the function, and simply adding quarterly partitions, maintaining a spare empty partition at the end of the table. Would this make more sense?
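For the quarterly approach, my understanding is the maintenance would reduce to something like this each quarter. Again a sketch only: the scheme name [DateScheme] and the staging table [Message].[Sms_Switch] are assumptions.

```sql
-- Add next quarter's boundary while everything beyond the current
-- highest boundary is still empty (metadata-only in that case):
ALTER PARTITION SCHEME [DateScheme] NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION [DateFunction]() SPLIT RANGE ('2015-07-01T00:00:00.000');

-- Retire the oldest quarter: switch partition 1 into an identically
-- structured staging table on the same filegroup, then remove the
-- now-empty boundary:
ALTER TABLE [Message].[Sms] SWITCH PARTITION 1 TO [Message].[Sms_Switch];
ALTER PARTITION FUNCTION [DateFunction]() MERGE RANGE ('2014-01-01T00:00:00.000');
```

That way both SPLIT and MERGE only ever touch empty partitions, which is the cheap case.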

As usual, thanks in advance for any comments/thoughts.

Regards

Matthew

Follow me on twitter @EvoDBA | Check out my blog Natural Selection DBA
GilaMonster
SSC Guru (910K reputation)

Group: General Forum Members
Points: 910158 Visits: 48846
Do you know why they implemented partitioning? What problem is it supposed to solve? Archiving/deleting of old data?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


Matthew Darwin
SSCarpal Tunnel (4.9K reputation)

Group: General Forum Members
Points: 4935 Visits: 880
Archiving and index maintenance are the two reasons I've been given, though I have an inkling the real reason might simply be wanting to try it out.

However, I suspect that it may well be a sensible candidate as the old table has some 700 million rows, most of which now is unneeded data.

Follow me on twitter @EvoDBA | Check out my blog Natural Selection DBA
Steve Hall
SSCoach (17K reputation)

Group: General Forum Members
Points: 17004 Visits: 13242
Matthew Darwin (4/1/2015)
Archiving and index maintenance are the two reasons I've been given, though I have an inkling the real reason might simply be wanting to try it out.

However, I suspect that it may well be a sensible candidate as the old table has some 700 million rows, most of which now is unneeded data.



We use partition switching to keep data for a variety of time periods - from 3 days to 15 months, depending upon what we need it for.
You'd be better off having a partition scheme based on age, not the actual date, because you'll have to alter it as time progresses.
Have a look at an example of a three day retention setup, which can be changed to suit the retention you require - http://wp.me/p3Vxvi-6S

Changing a table with 700 million rows to cater for partitioning is going to be interesting.

Edited for typo.

Steve Hall
Linkedin
Blog Site
Matthew Darwin
SSCarpal Tunnel (4.9K reputation)

Group: General Forum Members
Points: 4935 Visits: 880
BrainDonor (4/1/2015)
Matthew Darwin (4/1/2015)
Archiving and index maintenance are the two reasons I've been given, though I have an inkling the real reason might simply be wanting to try it out.

However, I suspect that it may well be a sensible candidate as the old table has some 700 million rows, most of which now is unneeded data.



We use partition switching to keep data for a variety of time periods - from 3 days to 15 months, depending upon what we need it for.
You'd be better off having a partition scheme based on age, not the actual date, because you'll have to alter it as time progresses.
Have a look at an example of a three day retention setup, which can be changed to suit the retention you require - http://wp.me/p3Vxvi-6S

Changing a table with 700 million rows to cater for partitioning is going to be interesting.

Edited for typo.


Does that scheme not mean that data is actually required to be moved between partitions, rather than just the partitions themselves remaining static and then being binned off when that data counter hits a specific age?

The old table is not going to be migrated into the new table; this is essentially a new system that will be starting with no data and building up from scratch. However, I just want to make sure that the decisions already made aren't going to mean that this is a problem going forwards once the table has accumulated that sort of volume of data.

Follow me on twitter @EvoDBA | Check out my blog Natural Selection DBA
Steve Hall
SSCoach (17K reputation)

Group: General Forum Members
Points: 17004 Visits: 13242
Matthew Darwin (4/1/2015)

Does that scheme not mean that data is actually required to be moved between partitions, rather than just the partitions themselves remaining static and then being binned off when that data counter hits a specific age?

The old table is not going to be migrated into the new table; this is essentially a new system that will be starting with no data and building up from scratch. However, I just want to make sure that the decisions already made aren't going to mean that this is a problem going forwards once the table has accumulated that sort of volume of data.



No, the data isn't physically moved - it's a change to the metadata, so it takes an extremely short time to execute. You shouldn't notice the difference between switching and truncating two rows or two million rows.
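As a sketch of what the purge step in the blog post does (object names here are illustrative, not the actual ones from the article): the partition is switched into an empty staging table with an identical structure on the same filegroup, and the staging table is then truncated.

```sql
-- Metadata-only: reassigns partition 3's pages to the staging table.
ALTER TABLE [dbo].[SmsRolling] SWITCH PARTITION 3 TO [dbo].[SmsRolling_Staging];

-- Deallocation, not row-by-row deletes -- cheap regardless of row count.
TRUNCATE TABLE [dbo].[SmsRolling_Staging];
```

Neither statement reads or writes the rows themselves, which is why two rows and two million rows behave the same.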

Steve Hall
Linkedin
Blog Site
Matthew Darwin
SSCarpal Tunnel (4.9K reputation)

Group: General Forum Members
Points: 4935 Visits: 880
No, I still don't understand; unless I've completely misunderstood, in the scenario you provide the value on the row changes, meaning that value should now be in a different partition from the one it was inserted into, i.e. it's now two days old rather than one. The data is effectively updated at that point (the partition value 1 needs to be rewritten as value 2; the page has to come off disk into the buffer, be changed, and be put back to disk), which would mean it has to be rewritten to the new partition?

I (...think I...) understand the metadata change for partition switching, but in the scenario above the actual data row itself changes to move it from the first to the second partition? If the data was static and new partition ranges were added, then the underlying data would never have to move between partitions.

More to the point, in my scenario where I have a wider window, only some of the data would move between the partitions (e.g. with chunks of data of three months, at the month change only the oldest month would be moved), so there could even be data at a page level that would need to exist within different partitions?

Follow me on twitter @EvoDBA | Check out my blog Natural Selection DBA
Steve Hall
SSCoach (17K reputation)

Group: General Forum Members
Points: 17004 Visits: 13242
It's OK - it took a while for me to understand it too - hence the blog article. I had to write it all out to understand it and see what it did.

I think the mistake is thinking that the PartitionNumber column is actually the number of days old that the row actually is. It isn't that at all - it's the result of the calculation with the modulo of 3. Data written today (01 Apr 2015) will have a value of 2 - it isn't an indication of the age of the row, just which partition it wants to store it in.
Tomorrow's data will want to be in PartitionNumber 3 - try it:
SELECT abs(datediff(day,CONVERT([date],'20000101',(112)),'20150401')%(3)+(1)) -- returns 2

SELECT abs(datediff(day,CONVERT([date],'20000101',(112)),'20150402')%(3)+(1)) -- returns 3
So the purging process uses the same calculation, works out that PartitionNumber 3 is required tomorrow and so switches the data out of partition 3 and truncates it, leaving it empty for the following day.
Because there are three partitions it cycles through them each day, always clearing the one that it needs to use the next day (if the job is scheduled correctly).
If you wanted a monthly partition scheme then for each month the calculation would calculate a partition number for the row created at that time (based on 'month' instead of 'day'). It wouldn't actually be the age (in months) of that partition, just the result of the calculation. As long as it is consistent for the month and the purge process uses the same calculation then it will have a monthly partition and purging system.

Steve Hall
Linkedin
Blog Site
Matthew Darwin
SSCarpal Tunnel (4.9K reputation)

Group: General Forum Members
Points: 4935 Visits: 880
Aha, light bulb moment! I knew I was missing something there, as much as anything from skim reading rather than properly looking at it.

So, if we were splitting into months it would look like this with 13 partitions, twelve that will hold data that we are currently processing, and a thirteenth for the one we will be purging:-


--show rolling partitions

DECLARE @Tables TABLE
(
    MonthDate datetime2(0) NOT NULL
)

INSERT INTO @Tables
SELECT TOP 13 DATEADD(MONTH, ROW_NUMBER() OVER (ORDER BY object_id) - 1, '2015-01-01') AS MonthDate
FROM sys.columns AS c

SELECT MonthDate
    , ABS(DATEDIFF(MONTH, 0, MonthDate) % 13) + 1 AS PartitionNumber
FROM @Tables

--code to define the partition value on the table

SELECT DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0)



Or the same for quarters but using modulo 5 and quarter in the datefunctions instead of month.

I guess the only risk is that if the job that runs the partition switch fails or doesn't run for some reason, you would end up with data going into a partition that was already populated with old data?
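If it helps anyone later, a guard along these lines could run at the start of the insert day, to shout if the purge job has fallen behind. A sketch only; the object name and the monthly modulo-13 calculation are taken from the example above and assumed to match the real setup.

```sql
-- Work out which partition next month's data will land in
-- (same ABS(...%13)+1 calculation as the computed column):
DECLARE @TargetPartition int =
    ABS(DATEDIFF(MONTH, 0, DATEADD(MONTH, 1, GETDATE())) % 13) + 1;

-- sys.partitions carries a row count per partition; index_id 0/1 is
-- the heap or clustered index, so we check it once, not per index.
IF EXISTS (
    SELECT 1
    FROM sys.partitions
    WHERE object_id = OBJECT_ID(N'Message.Sms')
      AND index_id IN (0, 1)
      AND partition_number = @TargetPartition
      AND rows > 0
)
    RAISERROR ('Next partition is not empty; the purge job may have failed.', 16, 1);
```

That turns a silent "new data mixed in with old" into a hard failure you can alert on.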

Follow me on twitter @EvoDBA | Check out my blog Natural Selection DBA
ScottPletcher
SSC Guru (85K reputation)

Group: General Forum Members
Points: 85663 Visits: 9639
Cluster the original table on ( CreatedDate, SmsId ), in that order, not vice-versa.

Once you get the best clustering key, you're likely not to need partitioning yet. And, if and when you do, the clustering key will already match your partitioning key.
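In DDL terms that would look something like the following; a sketch, using DROP_EXISTING to replace the index of the same name in one operation, and UNIQUE since SmsId is unique on its own.

```sql
-- Recluster on (CreatedDate, SmsId): date leads, so it matches a
-- future CreatedDate partitioning key; SmsId makes the key unique.
CREATE UNIQUE CLUSTERED INDEX [IX:SMS:Clustered] ON [Message].[Sms]
(
    [CreatedDate] ASC,
    [SmsId] ASC
)
WITH (DROP_EXISTING = ON);
```

On an already-large table this is a full rebuild, so it's the sort of thing to do before the volume arrives, not after.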

SQL DBA, SQL Server MVP (07, 08, 09)

Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial:
If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them.