Partition Table by Year

Question

Post reply

Partition Table by Year

Welsh Corgi

SSC Guru

Points: 116520
More actions
January 19, 2016 at 9:13 am

#307762

I'm looking for articles on Partitioning a Table by year.
I have read several and executed the code but I have not really found what I need.
I want to partition by year based on a Column name Call_Date in table tblCall. FYI, I did not name the table.
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 1

The ID Column is the Primary Key.

I archive some records already.

To what extent would that complicate things?

The following years have not been archived:

YearDateRecordCount
19981
20071
20093
20102
20117
2012242464
2013285255
2014355891
2015394239
201618297

The following years have been archived:

YearDateRecordCount
NULL46
189938
19951
19981295
19997967
20006600
200150483
200267137
200370068
200467311
200568502
200668595
200784438
2008104377
2009151171
2010200165
2011232679

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Jeff Moden SSC Guru Points: 1004681 More actions · Answer 2

Welsh Corgi (1/19/2016)
The ID Column is the Primary Key.
I archive some records already.
To what extent would that complicate things?

Possibly, a shedload. What have you used as the clustered index for the partitioned table? And, remember, the PK does NOT have to be the clustered index.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 3

If you want to partition by year, then you will first have to make sure that the date column is the leading column in the clustered index. You will probably also want it in all nonclustered indexes, to ensure that you have partition-aligned indexes.

To partition by year, you will want to define the partition boundaries to be Jan 1st of each year. And to make sure that rows with a timestamp of exactly midnight of Jan 1 go into the partition for the new year, specify RANGE RIGHT.

So the syntax would be:

CREATE PARTITION FUNCTION GoodNameGoesHere (datetime)

AS RANGE RIGHT

FOR VALUES ('20120101', '20130101', '20140101', '20150101', '20160101', '20170101');

I have chosen a single partition for all data before 2012 (since that's just a few rows anyway). The last partition (2017 and later) will remain empty. That is a best practice, and I recommend splitting it *before* data starts to appear in this partition. Splitting an empty partition is metadata only, as soon as data exists it becomes much more costly. For the same reason, I recommend archiving the pre-2012 partition (by using partition swapping), but leaving it in place - once you have also archived the 2013 data and beth first partitions are empty, you can merge them - again metadata only and very fast because no data has to be moved.

Since you already found several articles, I assume that you can figure out how to create the partition scheme and set up the table and indexes.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 4

Jeff Moden (1/19/2016)
Welsh Corgi (1/19/2016)
The ID Column is the Primary Key.
I archive some records already.
To what extent would that complicate things?
Possibly, a shedload. What have you used as the clustered index for the partitioned table? And, remember, the PK does NOT have to be the clustered index.

It is actually the Call_ID. I did not create the table.

Perhaps the Call_date would be a candidate s the Primary Key.

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 5

Welsh Corgi (1/19/2016)
Perhaps the Call_date would be a candidate s the Primary Key.

If you never have more than one call per call_date, then yes it is a candidate key. Otherwise no.

But if you want to partition by year, then you will definitely need to make Call_date the leading column in the clustered index,

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 6

Jeff Moden (1/19/2016)
Welsh Corgi (1/19/2016)
The ID Column is the Primary Key.
I archive some records already.
To what extent would that complicate things?
Possibly, a shedload. What have you used as the clustered index for the partitioned table? And, remember, the PK does NOT have to be the clustered index.

There are too many indexes on the table, 33.:w00t:

I have not had been here very long so but I'm ready to start dropping indexes.

So keep the non-aligned indexes and create aligned indexes with the Call_Date Column as the leading field in the index?

I really do not think that this table is a good candidate for partitioning.

Thanks Jeff.

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Jeff Moden SSC Guru Points: 1004681 More actions · Answer 7

Welsh Corgi (1/19/2016)
Jeff Moden (1/19/2016)
Welsh Corgi (1/19/2016)
The ID Column is the Primary Key.
I archive some records already.
To what extent would that complicate things?
Possibly, a shedload. What have you used as the clustered index for the partitioned table? And, remember, the PK does NOT have to be the clustered index.
It is actually the Call_ID. I did not create the table.
Perhaps the Call_date would be a candidate s the Primary Key.

You're moving too fast. Slow down a bit and let's do it right the first time.

It doesn't actually matter what the original table looks like for indexes but it's uber important as to what they look like for the partitioned table. Before we do any of that, we also need to clarify a couple of things.

First of all, since this is a "call" table, would it be a true statement that ALL the rows in this table are only written once and are NEVER EVER updated, deleted, or otherwise modified?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 8

Hugo Kornelis (1/19/2016)
If you want to partition by year, then you will first have to make sure that the date column is the leading column in the clustered index. You will probably also want it in all nonclustered indexes, to ensure that you have partition-aligned indexes.
To partition by year, you will want to define the partition boundaries to be Jan 1st of each year. And to make sure that rows with a timestamp of exactly midnight of Jan 1 go into the partition for the new year, specify RANGE RIGHT.
So the syntax would be:
CREATE PARTITION FUNCTION GoodNameGoesHere (datetime)
AS RANGE RIGHT
FOR VALUES ('20120101', '20130101', '20140101', '20150101', '20160101', '20170101');
I have chosen a single partition for all data before 2012 (since that's just a few rows anyway). The last partition (2017 and later) will remain empty. That is a best practice, and I recommend splitting it *before* data starts to appear in this partition. Splitting an empty partition is metadata only, as soon as data exists it becomes much more costly. For the same reason, I recommend archiving the pre-2012 partition (by using partition swapping), but leaving it in place - once you have also archived the 2013 data and beth first partitions are empty, you can merge them - again metadata only and very fast because no data has to be moved.
Since you already found several articles, I assume that you can figure out how to create the partition scheme and set up the table and indexes.

Thanks for your advice.

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 9

The records in the call table are updated Regularly.

Thanks.

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 10

deja vu

http://www.sqlservercentral.com/Forums/Topic1696588-3077-1.aspx

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day

Jeff Moden SSC Guru Points: 1004681 More actions · Answer 11

Welsh Corgi (1/19/2016)
The records in the call table are updated Regularly.
Thanks.

So, you're saying that rows for calls placed in 2014 can be updated in 2016? Something is VERY wrong here and I wouldn't touch this table until 1) you understand all of the ramifications and requirements of this table and 2) until you've done a serious deep dive on partitioning, files, filegroups, and "Piecemeal Restores".

{EDIT}... and, yeah... that's going to take some time but it's going to be well worth every precious second.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004681 More actions · Answer 12

J Livingston SQL (1/19/2016)
deja vu
http://www.sqlservercentral.com/Forums/Topic1696588-3077-1.aspx

Heh... I thought this looked familiar.

My advice in my previous post stands, though. 🙂

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 13

There are only 1,296,391 records but the table is wide.

I recommended that we did not partition the table. There is too much risk among other things but someone thinks that it would be better to partition the table.

Yes the reason they want it is for performance but I have improved the performance.

Good advice and your input should be enough to talk my way out of having to do this.

Thanks. :blush:

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Jeff Moden SSC Guru Points: 1004681 More actions · Answer 14

Welsh Corgi (1/19/2016)
There are only 1,296,391 records but the table is wide.
I recommended that we did not partition the table. There is too much risk among other things but someone thinks that it would be better to partition the table.
Good advice and your input should be enough to talk my way out of having to do this.
Thanks. :blush:

I absolutely agree. 1.3 million rows over such a long period just isn't worth partitioning. I would, however, take a serious look at those 33 separate indexes to see how many of them are actually being used. I understand (wrote 2 worldwide call accounting packages all in T-SQL in the past) the nature of CDRs (Call Detail Records) being wide but 33 indexes seem a bit much.

You can take a look at sys,dm_db_index_usage_stats to start your quest there.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)