June 22, 2015 at 10:30 am
I have a table that needs to be partitioned, and I need to archive one of the partitions.
I did this in Oracle several years ago but not in SQL Server.
I'm looking for a basic example on how to do this.
I know the basic steps but the examples that I found on the Web were not quite what I'm looking for.
Does anyone have a good article?
Partition an existing SQL Server Table
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
June 22, 2015 at 10:43 am
You'd need to create the required partition schemes/functions and then rebuild your tables' clustered index onto the partition scheme.
Depending on how your table is built you would also have to redefine your clustered index and primary key (to use the column that you want to partition on).
I wrote an introduction to partitioning here:- http://www.sqlservercentral.com/blogs/the-dba-who-came-in-from-the-cold/2014/06/04/partitioning-basics-part-1/
(Sorry for the shameless self promotion).
If you post the table's DDL statement and the partition scheme + function, I can help you implement it.
Be warned though, this is not an online operation
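As a rough sketch of the steps described above (all object names here are placeholders, not the OP's actual tables or filegroups):

```sql
-- 1. Partition function: defines the boundary values (yearly boundaries
--    are used here purely as an illustration)
CREATE PARTITION FUNCTION pf_Example (datetime)
AS RANGE RIGHT FOR VALUES ('2013-01-01', '2014-01-01', '2015-01-01');

-- 2. Partition scheme: maps each partition to a filegroup
CREATE PARTITION SCHEME ps_Example
AS PARTITION pf_Example ALL TO ([PRIMARY]);

-- 3. Build the clustered index on the scheme to physically move the data
--    (assumes the table does not already have a clustered index on this column)
CREATE CLUSTERED INDEX IX_Example_PartitionCol
ON dbo.ExampleTable (PartitionCol)
ON ps_Example (PartitionCol);
```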
June 22, 2015 at 10:48 am
DBA From The Cold (6/22/2015)
You'd need to create the required partition schemes/functions and then rebuild your tables' clustered index onto the partition scheme. Depending on how your table is built you would also have to redefine your clustered index and primary key (to use the column that you want to partition on).
I wrote an introduction to partitioning here:- http://www.sqlservercentral.com/blogs/the-dba-who-came-in-from-the-cold/2014/06/04/partitioning-basics-part-1/
(Sorry for the shameless self promotion).
If you post the table's DDL statement and the partition scheme + function, I can help you implement it.
Be warned though, this is not an online operation
Thanks!
June 22, 2015 at 12:06 pm
Welsh Corgi (6/22/2015)
Thanks!
The first very important question is, which edition do you have? Standard or Enterprise?
The second very important question is what column will you partition by?
After that come some other important questions.
Is the partitioning column a DateTime or ???
Is the table Insert only on the current date or are updates ever allowed? If so, how far back and on which columns?
Why do you want to partition? If it's for reasons of query performance, you'll likely be disappointed.
And, yes... if you have the diskspace, converting an existing table to a partitioned table can certainly be an online evolution with less than just several seconds of blocking in total.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 22, 2015 at 1:09 pm
Jeff Moden (6/22/2015)
The first very important question is, which edition do you have? Standard or Enterprise?
The second very important question is what column will you partition by?
After that come some other important questions.
Is the partitioning column a DateTime or ???
Is the table Insert only on the current date or are updates ever allowed? If so, how far back and on which columns?
Why do you want to partition? If it's for reasons of query performance, you'll likely be disappointed.
And, yes... if you have the diskspace, converting an existing table to a partitioned table can certainly be an online evolution with less than just several seconds of blocking in total.
It is moving to a 2014 Box with the Enterprise Edition.
I was going to partition by Creation Date.
Inserts only, no updates.
They want me to partition because the table is so large. I had suggested archiving and deleting the records, but it was mentioned that the log file would get too large.
I mentioned that if you use the TOP operator and a WHILE loop, and commit each batch, you would not have the issue of the transaction log growing. I wrote code to do this several years ago but I don't have it any more.
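A minimal sketch of that TOP/WHILE delete pattern (the batch size and cutoff date are illustrative and would need tuning for the real table):

```sql
-- Delete old rows in small batches so each transaction commits quickly
-- and log space can be reused between batches (assumes SIMPLE recovery,
-- or that log backups are running in the meantime).
DECLARE @BatchSize int      = 50000,
        @Cutoff    datetime = '2008-01-01',
        @Rows      int      = 1;

WHILE @Rows > 0
BEGIN
    DELETE TOP (@BatchSize)
    FROM dbo.TaskNote
    WHERE CreationDate < @Cutoff;

    SET @Rows = @@ROWCOUNT;  -- loop ends when no rows remain to delete
END;
```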
Thanks Jeff.
June 22, 2015 at 1:11 pm
Here is the data in that CreateDate Column:
CreationDate Year    RecordCount
NULL                 4439
1930                 1
1940                 1
1990                 3
1997                 5880
1998                 63068
1999                 85365
2000                 82985
2001                 151719
2002                 255011
2003                 386974
2004                 408519
2005                 525129
2006                 638583
2007                 852965
2008                 1262168
2009                 2022856
2010                 2974695
2011                 3285952
2012                 3631320
2013                 4838442
2014                 3982934
2015                 210
2016                 2
2017                 1
2018                 1
2019                 1
2020                 4
2029                 1
2031                 1
2050                 2
2077                 1
2088                 1
2102                 2
2103                 7
2104                 2
2112                 2
2201                 1
2304                 1
2555                 2
2912                 1
2913                 1
3013                 5
3014                 1
3023                 1
5008                 1
5011                 1
6201                 1
7013                 1
7201                 1
8201                 1
9201                 1
June 22, 2015 at 1:42 pm
Currently there are 3 File Groups. Primary, FG1 and FG2.
FG1 is the default.
What impact does that have on creating filegroups for the partitions, if any?
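If you do want dedicated filegroups per partition, the general shape is below. The database name, filegroup names, and file path are illustrative only, not taken from the actual environment:

```sql
-- Add a filegroup and a data file for each partition you want isolated
ALTER DATABASE MyDb ADD FILEGROUP FG_2013;
ALTER DATABASE MyDb ADD FILE
    (NAME = 'MyDb_2013', FILENAME = 'D:\Data\MyDb_2013.ndf', SIZE = 512MB)
TO FILEGROUP FG_2013;

-- Then list the filegroups, in partition order, in the partition scheme
-- instead of using ALL TO, e.g.:
-- CREATE PARTITION SCHEME ps_Example AS PARTITION pf_Example
--     TO (FG_2012, FG_2013, FG_2014, [PRIMARY]);
```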
June 22, 2015 at 9:53 pm
To be absolutely honest, the table has only about 25 million rows in it and, unless it's a very wide table or unless it has some massive data stored in MAX datatypes, it seems like it would be much more of a pain to go through partitioning than it would to do the looping deletes that you speak of. Most of the years have less than a million rows that could easily be deleted by month, one per night, that would prevent any type of logfile blowout. Even your biggest year's worth of data contains less than 5 million rows and, if you divide that by 12 months, that's only 416,000 rows per month, give or take any "seasonal" adjustments.
This table certainly could be partitioned but is management sure they want to go through such a thing instead of the partial deletes over time that you so correctly cited? They should also be made aware that partitioning is not a panacea of performance for most queries.
Could you run sp_SpaceUsed on the table and post the results so that I can get a better idea of the space used by the table and its indexes?
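For reference, that check is a one-liner (using the table name from this thread):

```sql
-- Reports row count plus reserved, data, and index space for the table
EXEC sp_spaceused N'dbo.TaskNote';
```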
Also, it would appear that 2015, which is half over, only has 210 rows in it. Is that, by any chance, a mistake, or is it due to a fiscal-year offset?
--Jeff Moden
Change is inevitable... Change for the better is not.
June 23, 2015 at 1:54 am
Jeff Moden (6/22/2015)
The first very important question is, which edition do you have? Standard or Enterprise?
The second very important question is what column will you partition by?
After that come some other important questions.
Is the partitioning column a DateTime or ???
Is the table Insert only on the current date or are updates ever allowed? If so, how far back and on which columns?
Why do you want to partition? If it's for reasons of query performance, you'll likely be disappointed.
And, yes... if you have the diskspace, converting an existing table to a partitioned table can certainly be an online evolution with less than just several seconds of blocking in total.
Just picking up on something that Jeff said: partitioning is not a performance tool, it is a data-management tool. If your colleagues think that by implementing partitioning on the table their query performance will improve then (again, as Jeff says) 99 times out of 100, they'll be disappointed.
June 23, 2015 at 5:44 am
Welsh Corgi (6/22/2015)
Here is the data in that CreateDate Column: [data table snipped]
They do not want it for performance reasons. They want it to be a way of archiving data.
With the data being out of whack, I was thinking that I need to update those date fields to valid dates.
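Before deciding how to correct them, one way to list the suspect rows. The 1997–2015 window here is read off the year counts posted above and is only an assumption about what counts as "valid":

```sql
-- Rows whose CreationDate is NULL or falls outside the plausible range
SELECT TaskNoteId, CreationDate
FROM dbo.TaskNote
WHERE CreationDate IS NULL
   OR CreationDate < '19970101'
   OR CreationDate >= '20160101';
```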
I'm not sure about the Partition Function syntax. Do I do a RANGE RIGHT so that I can archive the old data?
I'm partitioning on the Creation Date Column.
CREATE TABLE [dbo].[TaskNote](
[TaskNoteId] [int] NOT NULL,
[ActiveTaskId] [int] NULL,
[TaskStateId] [int] NULL,
[CreationDate] [datetime] NULL,
[ResourceId] [int] NULL,
[CallId] [int] NULL,
[Memo] [text] NULL,
[sentToXactFlg] [bit] NULL,
CONSTRAINT [PK_TaskNote] PRIMARY KEY CLUSTERED
(
[TaskNoteId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [FG1]
) ON [FG1] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[TaskNote] ADD CONSTRAINT [DF_TaskNote_sentToXactFlg] DEFAULT (0) FOR [sentToXactFlg]
GO
June 23, 2015 at 6:59 am
OK so I've run through this on my dev instance.
DISCLAIMER - Please don't run this without testing it first (I know you won't but just need to say it 🙂 )
So first create the partition function and scheme:-
--Create partition function
DECLARE @CurrentDate DATETIME = DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 0);
CREATE PARTITION FUNCTION PF_TaskNote (DATETIME)
AS RANGE RIGHT
FOR VALUES (@CurrentDate+7,@CurrentDate+6,@CurrentDate+5,@CurrentDate+4,
@CurrentDate+3,@CurrentDate+2,@CurrentDate+1,@CurrentDate,
@CurrentDate-1,@CurrentDate-2,@CurrentDate-3,@CurrentDate-4,
@CurrentDate-5,@CurrentDate-6,@CurrentDate-7,@CurrentDate-8);
--Create partition scheme
CREATE PARTITION SCHEME PS_TaskNote
AS PARTITION PF_TaskNote
ALL TO (FG1);
For ease I created each partition on the FG1 filegroup, but you can change that as needed. I've used RANGE RIGHT so that the above script sets the partition boundaries as:-
x < 2015-06-15 00:00:00
2015-06-15 00:00:00 <= x < 2015-06-16 00:00:00
2015-06-16 00:00:00 <= x < 2015-06-17 00:00:00
So that makes each partition one full day.
Then I set the CreationDate column to not null (you'll need to check that this is OK):-
ALTER TABLE [dbo].[TaskNote] ALTER COLUMN [CreationDate] [DATETIME] NOT NULL;
Now I'm going to drop the existing PK and re-create as nonclustered (this is why it's not an online operation so be careful!):-
--Drop existing PK
ALTER TABLE [dbo].[TaskNote] DROP CONSTRAINT [PK_TaskNote];
GO
--Recreate PK as non clustered index
ALTER TABLE [dbo].[TaskNote] ADD CONSTRAINT [PK_TaskNote] PRIMARY KEY NONCLUSTERED
(
[TaskNoteId],[CreationDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON PS_TaskNote(CreationDate);
GO
I'm assuming that you want to keep the existing PK. I've had to add the CreationDate column because the partitioning key must be part of every unique index and unique constraint on the table.
Finally I'm going to add a clustered index to the table on CreationDate:-
--Create clustered index on CreationDate column
CREATE CLUSTERED INDEX IX_TaskNote_CreationDate ON [dbo].[TaskNote]
(
CreationDate ASC
) ON PS_TaskNote(CreationDate);
You can check the partitions then by running:-
SELECT
t.name AS TableName, i.name AS IndexName, p.partition_number, p.partition_id,
r.boundary_id, r.value AS BoundaryValue, p.rows
FROM
sys.tables AS t
INNER JOIN
sys.indexes AS i ON t.object_id = i.object_id
INNER JOIN
sys.partitions AS p ON i.object_id = p.object_id AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON a.container_id = p.hobt_id
INNER JOIN
sys.filegroups fg ON fg.data_space_id = a.data_space_id
INNER JOIN
sys.partition_schemes AS s ON i.data_space_id = s.data_space_id
INNER JOIN
sys.partition_functions AS f ON s.function_id = f.function_id
LEFT OUTER JOIN
sys.partition_range_values AS r ON f.function_id = r.function_id
AND r.boundary_id = p.partition_number
WHERE
t.name = 'TaskNote'
AND
i.type <= 1
AND
a.type = 1 --in row data only
So run through that and let me know if you have any questions. If anyone on here has a better way of doing it, please let me know 🙂
June 23, 2015 at 7:04 am
I'd do RANGE RIGHT because it's the method that humans understand intuitively: the boundary date is the first date of the partition rather than the last.
I have a presentation with a fully working code example and a very handy view that tells you everything about all partitions in a database. I'd also make one file per filegroup and one filegroup per month to greatly decrease the backup requirements. I'll dig that out of my archives and attach it all this morning.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 23, 2015 at 7:31 am
Actually, DBA from the Cold provided a pretty good summary for you to easily follow. I'm not begging off but my presentation material is a whole lot more complicated because I also wrote a fairly complicated bit of code to do file compaction of free space during the final index rebuilds before making a partition Read_Only for backup consolidation. The code is the result of me partitioning a 350GB table across 70 months and prevented about 70GB of wasted space and I had to do it all with virtually no downtime. I strongly suspect that you won't have such a problem.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 23, 2015 at 8:10 am
Thanks! :-)
I did not get the example on the date ranges.
Did you see where I posted the records with a GROUP BY on year and a record count for each year?
If I create the ranges by month, I will have a lot of partitions, I believe 192. If I do it by Year I will have 16.
Thanks again.
June 23, 2015 at 8:15 am
Yep... Saw that. You'd be better off partitioning by year.
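For year-based partitions, the function and scheme from earlier in the thread might be reshaped along these lines. This is a sketch only: with RANGE RIGHT, the NULL and pre-1998 oddball rows land in the first partition and the far-future oddballs in the last:

```sql
-- One partition per year of real data (1997 through 2015),
-- bounded on each side by a catch-all partition for the bad dates
CREATE PARTITION FUNCTION pf_TaskNote_Yearly (datetime)
AS RANGE RIGHT FOR VALUES
    ('1998-01-01', '1999-01-01', '2000-01-01', '2001-01-01',
     '2002-01-01', '2003-01-01', '2004-01-01', '2005-01-01',
     '2006-01-01', '2007-01-01', '2008-01-01', '2009-01-01',
     '2010-01-01', '2011-01-01', '2012-01-01', '2013-01-01',
     '2014-01-01', '2015-01-01', '2016-01-01');

CREATE PARTITION SCHEME ps_TaskNote_Yearly
AS PARTITION pf_TaskNote_Yearly ALL TO (FG1);
```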
--Jeff Moden
Change is inevitable... Change for the better is not.