Partition an existing SQL Server Table

I have a table that needs to be partitioned, and I need to archive one of the partitions.

    I did this in Oracle several years ago but not in SQL Server.

    I'm looking for a basic example on how to do this.

    I know the basic steps but the examples that I found on the Web were not quite what I'm looking for.

    Does anyone have a good article?


    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

You'd need to create the required partition function/scheme and then rebuild your table's clustered index onto the partition scheme.

    Depending on how your table is built, you would also have to redefine your clustered index and primary key (to use the column that you want to partition on).

    I wrote an introduction to partitioning here:- http://www.sqlservercentral.com/blogs/the-dba-who-came-in-from-the-cold/2014/06/04/partitioning-basics-part-1/

    (Sorry for the shameless self-promotion.)

    If you post the table's DDL statement and the partition scheme + function, I can help you implement it.

    Be warned though: this is not an online operation.

  • DBA From The Cold (6/22/2015)



    Thanks!


  • Welsh Corgi (6/22/2015)



    The first very important question is, which edition do you have? Standard or Enterprise?

    The second very important question is what column will you partition by?

    After that come some other important questions.

    Is the partitioning column a DateTime or ???

    Is the table Insert only on the current date or are updates ever allowed? If so, how far back and on which columns?

    Why do you want to partition? If it's for reasons of query performance, you'll likely be disappointed.

And, yes... if you have the disk space, converting an existing table to a partitioned table can certainly be an online evolution with only a few seconds of blocking in total.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (6/22/2015)



    It is moving to a 2014 Box with the Enterprise Edition.

    I was going to partition by Creation Date.

    Inserts only, no updates.

They want me to partition because the table is so large. I had suggested archiving and deleting the records, but it was mentioned that the log file would get too large.

    I mentioned that if you use the TOP operator with a WHILE loop and commit each batch, you would not have the issue of the transaction log growing. I wrote code to do this several years ago but I don't have it any more.
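A minimal sketch of that batched-delete pattern (the table and column names follow the DDL posted later in this thread; the cutoff date and batch size are illustrative assumptions):

    ```sql
    -- Delete old rows in small batches so transaction log space can be
    -- reused between commits instead of growing in one giant transaction.
    -- @Cutoff and @BatchSize are assumptions to be tuned for the system.
    DECLARE @BatchSize INT      = 50000,
            @Cutoff    DATETIME = '2010-01-01',
            @Rows      INT      = 1;

    WHILE @Rows > 0
    BEGIN
        BEGIN TRANSACTION;

        DELETE TOP (@BatchSize)
        FROM dbo.TaskNote
        WHERE CreationDate < @Cutoff;

        SET @Rows = @@ROWCOUNT;

        COMMIT TRANSACTION;  -- commit per batch to keep the log in check
    END;
    ```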

    Thanks Jeff.


  • Here is the data in that CreateDate Column:

CreationDateYear   RecordCount
    NULL               4439
    1930               1
    1940               1
    1990               3
    1997               5880
    1998               63068
    1999               85365
    2000               82985
    2001               151719
    2002               255011
    2003               386974
    2004               408519
    2005               525129
    2006               638583
    2007               852965
    2008               1262168
    2009               2022856
    2010               2974695
    2011               3285952
    2012               3631320
    2013               4838442
    2014               3982934
    2015               210
    2016               2
    2017               1
    2018               1
    2019               1
    2020               4
    2029               1
    2031               1
    2050               2
    2077               1
    2088               1
    2102               2
    2103               7
    2104               2
    2112               2
    2201               1
    2304               1
    2555               2
    2912               1
    2913               1
    3013               5
    3014               1
    3023               1
    5008               1
    5011               1
    6201               1
    7013               1
    7201               1
    8201               1
    9201               1


Currently there are 3 filegroups: Primary, FG1 and FG2.

    FG1 is the default.

    What impact does that have on creating filegroups for the partitions, if any?
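If extra filegroups are wanted for the partitions, they are added at the database level before the partition scheme references them; a sketch (the database name, file path, and sizes here are placeholders):

    ```sql
    -- Add a filegroup and one data file to hold a partition
    ALTER DATABASE [MyDatabase] ADD FILEGROUP [FG_2014];

    ALTER DATABASE [MyDatabase] ADD FILE (
        NAME      = N'FG_2014_Data',
        FILENAME  = N'D:\SQLData\MyDatabase_FG_2014.ndf',
        SIZE      = 512MB,
        FILEGROWTH = 256MB
    ) TO FILEGROUP [FG_2014];
    ```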


  • To be absolutely honest, the table has only about 25 million rows in it and, unless it's a very wide table or unless it has some massive data stored in MAX datatypes, it seems like it would be much more of a pain to go through partitioning than it would to do the looping deletes that you speak of. Most of the years have less than a million rows that could easily be deleted by month, one per night, that would prevent any type of logfile blowout. Even your biggest year's worth of data contains less than 5 million rows and, if you divide that by 12 months, that's only 416,000 rows per month, give or take any "seasonal" adjustments.

    This table certainly could be partitioned but is management sure they want to go through such a thing instead of the partial deletes over time that you so correctly cited? They should also be made aware that partitioning is not a panacea of performance for most queries.

    Could you run sp_SpaceUsed on the table and post the results so that I can get a better idea of the space used by the table and its indexes?

    Also, it would appear that 2015, which is half over, only has 210 rows in it? Is that, by any chance, a mistake or is it due to a fiscal year offset?
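For reference, the call being asked for is simply:

    ```sql
    -- Reports reserved, data, and index space used by the table
    EXEC sp_spaceused N'dbo.TaskNote';
    ```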


  • Jeff Moden (6/22/2015)



Just picking up on something that Jeff said: partitioning is not a performance tool, it is for data management. If your colleagues think that by implementing partitioning on the table their query performance will improve then (again, as Jeff says) 99 times out of 100 they'll be disappointed.

  • Welsh Corgi (6/22/2015)



    They do not want it for performance reasons. They want it to be a way of archiving data.

With the data being out of whack, I was thinking that I need to update those date fields to a valid date.

    I'm not sure about the Partition Function syntax. Do I do a RANGE RIGHT so that I can archive the old data?

    I'm partitioning on the Creation Date Column.

CREATE TABLE [dbo].[TaskNote](
        [TaskNoteId] [int] NOT NULL,
        [ActiveTaskId] [int] NULL,
        [TaskStateId] [int] NULL,
        [CreationDate] [datetime] NULL,
        [ResourceId] [int] NULL,
        [CallId] [int] NULL,
        [Memo] [text] NULL,
        [sentToXactFlg] [bit] NULL,
        CONSTRAINT [PK_TaskNote] PRIMARY KEY CLUSTERED
        (
            [TaskNoteId] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
                ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [FG1]
    ) ON [FG1] TEXTIMAGE_ON [PRIMARY]
    GO

    ALTER TABLE [dbo].[TaskNote] ADD CONSTRAINT [DF_TaskNote_sentToXactFlg] DEFAULT (0) FOR [sentToXactFlg]
    GO
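Before deciding on replacement values for the out-of-whack dates, the offending rows could be listed first; a sketch (the 1997-2016 window is an assumption drawn from the posted counts):

    ```sql
    -- List rows whose CreationDate is NULL or outside the plausible window
    SELECT TaskNoteId, CreationDate
    FROM dbo.TaskNote
    WHERE CreationDate IS NULL
       OR CreationDate <  '1997-01-01'
       OR CreationDate >= '2016-01-01'
    ORDER BY CreationDate;
    ```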


  • OK so I've run through this on my dev instance.

    DISCLAIMER - Please don't run this without testing it first (I know you won't but just need to say it 🙂 )

    So first create the partition function and scheme:-

--Create partition function (boundary values listed in ascending order)
    DECLARE @CurrentDate DATETIME = DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 0);

    CREATE PARTITION FUNCTION PF_TaskNote (DATETIME)
    AS RANGE RIGHT
    FOR VALUES (@CurrentDate-8, @CurrentDate-7, @CurrentDate-6, @CurrentDate-5,
                @CurrentDate-4, @CurrentDate-3, @CurrentDate-2, @CurrentDate-1,
                @CurrentDate,   @CurrentDate+1, @CurrentDate+2, @CurrentDate+3,
                @CurrentDate+4, @CurrentDate+5, @CurrentDate+6, @CurrentDate+7);

    --Create partition scheme
    CREATE PARTITION SCHEME PS_TaskNote
    AS PARTITION PF_TaskNote
    ALL TO (FG1);

For ease I created each partition on the FG1 filegroup, but you can change that as needed. I've used RANGE RIGHT, so the script above sets the partition boundaries as:-

    x < 2015-06-15 00:00:00

    2015-06-15 00:00:00 <= x < 2015-06-16 00:00:00

    2015-06-16 00:00:00 <= x < 2015-06-17 00:00:00

    ...and so on. So that makes each partition one full day.

    Then I set the CreationDate column to not null (you'll need to check that this is OK):-

    ALTER TABLE [dbo].[TaskNote] ALTER COLUMN [CreationDate] [DATETIME] NOT NULL;

Now I'm going to drop the existing PK and re-create it as nonclustered (this is why it's not an online operation, so be careful!):-

    --Drop existing PK
    ALTER TABLE [dbo].[TaskNote] DROP CONSTRAINT [PK_TaskNote];
    GO

    --Recreate PK as nonclustered index
    ALTER TABLE [dbo].[TaskNote] ADD CONSTRAINT [PK_TaskNote] PRIMARY KEY NONCLUSTERED
    (
        [TaskNoteId] ASC, [CreationDate] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
            IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON,
            FILLFACTOR = 90) ON PS_TaskNote(CreationDate);
    GO

    I'm assuming that you want to keep the existing PK. I've had to add the CreationDate column because the partitioning column must be part of every unique index on the table.

    Finally I'm going to add a clustered index to the table on CreationDate:-

--Create clustered index on CreationDate column
    CREATE CLUSTERED INDEX IX_TaskNote_CreationDate ON [dbo].[TaskNote]
    (
        CreationDate ASC
    ) ON PS_TaskNote(CreationDate);

    You can check the partitions then by running:-

SELECT  t.name AS TableName,
            i.name AS IndexName,
            p.partition_number,
            p.partition_id,
            r.boundary_id,
            r.value AS BoundaryValue,
            p.rows
    FROM    sys.tables AS t
    INNER JOIN sys.indexes AS i
            ON t.object_id = i.object_id
    INNER JOIN sys.partitions AS p
            ON i.object_id = p.object_id AND i.index_id = p.index_id
    INNER JOIN sys.allocation_units AS a
            ON a.container_id = p.hobt_id
    INNER JOIN sys.filegroups AS fg
            ON fg.data_space_id = a.data_space_id
    INNER JOIN sys.partition_schemes AS s
            ON i.data_space_id = s.data_space_id
    INNER JOIN sys.partition_functions AS f
            ON s.function_id = f.function_id
    LEFT OUTER JOIN sys.partition_range_values AS r
            ON f.function_id = r.function_id AND r.boundary_id = p.partition_number
    WHERE   t.name = 'TaskNote'
      AND   i.type <= 1  --heap or clustered index only
      AND   a.type = 1;  --in-row data only

    So run through that and let me know if you have any questions. If anyone on here has a better way of doing it, please let me know 🙂
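Since the original goal was archiving a partition: once the table is partitioned, a whole partition can be detached as a metadata-only operation with ALTER TABLE ... SWITCH. A sketch (the staging table dbo.TaskNote_Archive is an assumption; it must be empty, have an identical structure, and sit on the same filegroup as the partition being switched):

    ```sql
    -- Find which partition a given date falls into
    SELECT $PARTITION.PF_TaskNote('2015-06-15') AS PartitionNumber;

    -- Switch that partition out into the empty staging table;
    -- the data doesn't move, so this is nearly instantaneous
    ALTER TABLE dbo.TaskNote
    SWITCH PARTITION 2 TO dbo.TaskNote_Archive;
    ```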

I'd do RANGE RIGHT because it seems to be the method that humans understand best: the partition's boundary date is the first date of the partition rather than the last.

    I have a presentation with a fully worked code example and a very handy view that tells you everything about all partitions in a database. I'd also make one file per filegroup and one filegroup per month to greatly decrease the backup requirements. I'll dig all that out of my archives and attach it this morning.


  • Actually, DBA from the Cold provided a pretty good summary for you to easily follow. I'm not begging off but my presentation material is a whole lot more complicated because I also wrote a fairly complicated bit of code to do file compaction of free space during the final index rebuilds before making a partition Read_Only for backup consolidation. The code is the result of me partitioning a 350GB table across 70 months and prevented about 70GB of wasted space and I had to do it all with virtually no downtime. I strongly suspect that you won't have such a problem.


Thanks! :-)

    I didn't quite follow the example on the date ranges.

    Did you see where I posted the record counts grouped by year?

    If I create the ranges by month I will have a lot of partitions, I believe 192. If I do it by year I will have 16.

    Thanks again.


  • Yep... Saw that. You'd be better off partitioning by year.
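Translated to a yearly scheme, the earlier function/scheme might look like this sketch (boundary years are taken from the posted counts; the stray pre-1997 and post-2015 dates would land in the first and last partitions):

    ```sql
    -- RANGE RIGHT with 1 January boundaries: each partition holds one full year
    CREATE PARTITION FUNCTION PF_TaskNote_Yearly (DATETIME)
    AS RANGE RIGHT FOR VALUES (
        '1997-01-01', '1998-01-01', '1999-01-01', '2000-01-01', '2001-01-01',
        '2002-01-01', '2003-01-01', '2004-01-01', '2005-01-01', '2006-01-01',
        '2007-01-01', '2008-01-01', '2009-01-01', '2010-01-01', '2011-01-01',
        '2012-01-01', '2013-01-01', '2014-01-01', '2015-01-01', '2016-01-01'
    );

    CREATE PARTITION SCHEME PS_TaskNote_Yearly
    AS PARTITION PF_TaskNote_Yearly
    ALL TO (FG1);
    ```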

