Partition Table by Year

  • I'm looking for articles on Partitioning a Table by year.

    I have read several and executed the code but I have not really found what I need.

    I want to partition by year based on a Column name Call_Date in table tblCall. FYI, I did not name the table.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • The ID Column is the Primary Key.

    I archive some records already.

    To what extent would that complicate things?

    The following years have not been archived:

    YearDateRecordCount

    19981

    20071

    20093

    20102

    20117

    2012242464

    2013285255

    2014355891

    2015394239

    201618297

    The following years have been archived:

    YearDateRecordCount

    NULL46

    189938

    19951

    19981295

    19997967

    20006600

    200150483

    200267137

    200370068

    200467311

    200568502

    200668595

    200784438

    2008104377

    2009151171

    2010200165

    2011232679

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Welsh Corgi (1/19/2016)


    The ID Column is the Primary Key.

    I archive some records already.

    To what extent would that complicate things?

    Possibly, a shedload. What have you used as the clustered index for the partitioned table? And, remember, the PK does NOT have to be the clustered index.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If you want to partition by year, then you will first have to make sure that the date column is the leading column in the clustered index. You will probably also want it in all nonclustered indexes, to ensure that you have partition-aligned indexes.

    To partition by year, you will want to define the partition boundaries to be Jan 1st of each year. And to make sure that rows with a timestamp of exactly midnight of Jan 1 go into the partition for the new year, specify RANGE RIGHT.

    So the syntax would be:

    CREATE PARTITION FUNCTION GoodNameGoesHere (datetime)

    AS RANGE RIGHT

    FOR VALUES ('20120101', '20130101', '20140101', '20150101', '20160101', '20170101');

    I have chosen a single partition for all data before 2012 (since that's just a few rows anyway). The last partition (2017 and later) will remain empty. That is a best practice, and I recommend splitting it *before* data starts to appear in this partition. Splitting an empty partition is metadata only, as soon as data exists it becomes much more costly. For the same reason, I recommend archiving the pre-2012 partition (by using partition swapping), but leaving it in place - once you have also archived the 2013 data and beth first partitions are empty, you can merge them - again metadata only and very fast because no data has to be moved.

    Since you already found several articles, I assume that you can figure out how to create the partition scheme and set up the table and indexes.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Jeff Moden (1/19/2016)


    Welsh Corgi (1/19/2016)


    The ID Column is the Primary Key.

    I archive some records already.

    To what extent would that complicate things?

    Possibly, a shedload. What have you used as the clustered index for the partitioned table? And, remember, the PK does NOT have to be the clustered index.

    It is actually the Call_ID. I did not create the table.

    Perhaps the Call_date would be a candidate s the Primary Key.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Welsh Corgi (1/19/2016)


    Perhaps the Call_date would be a candidate s the Primary Key.

    If you never have more than one call per call_date, then yes it is a candidate key. Otherwise no.

    But if you want to partition by year, then you will definitely need to make Call_date the leading column in the clustered index,


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Jeff Moden (1/19/2016)


    Welsh Corgi (1/19/2016)


    The ID Column is the Primary Key.

    I archive some records already.

    To what extent would that complicate things?

    Possibly, a shedload. What have you used as the clustered index for the partitioned table? And, remember, the PK does NOT have to be the clustered index.

    There are too many indexes on the table, 33.:w00t:

    I have not had been here very long so but I'm ready to start dropping indexes.

    So keep the non-aligned indexes and create aligned indexes with the Call_Date Column as the leading field in the index?

    I really do not think that this table is a good candidate for partitioning.

    Thanks Jeff.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Welsh Corgi (1/19/2016)


    Jeff Moden (1/19/2016)


    Welsh Corgi (1/19/2016)


    The ID Column is the Primary Key.

    I archive some records already.

    To what extent would that complicate things?

    Possibly, a shedload. What have you used as the clustered index for the partitioned table? And, remember, the PK does NOT have to be the clustered index.

    It is actually the Call_ID. I did not create the table.

    Perhaps the Call_date would be a candidate s the Primary Key.

    You're moving too fast. Slow down a bit and let's do it right the first time.

    It doesn't actually matter what the original table looks like for indexes but it's uber important as to what they look like for the partitioned table. Before we do any of that, we also need to clarify a couple of things.

    First of all, since this is a "call" table, would it be a true statement that ALL the rows in this table are only written once and are NEVER EVER updated, deleted, or otherwise modified?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hugo Kornelis (1/19/2016)


    If you want to partition by year, then you will first have to make sure that the date column is the leading column in the clustered index. You will probably also want it in all nonclustered indexes, to ensure that you have partition-aligned indexes.

    To partition by year, you will want to define the partition boundaries to be Jan 1st of each year. And to make sure that rows with a timestamp of exactly midnight of Jan 1 go into the partition for the new year, specify RANGE RIGHT.

    So the syntax would be:

    CREATE PARTITION FUNCTION GoodNameGoesHere (datetime)

    AS RANGE RIGHT

    FOR VALUES ('20120101', '20130101', '20140101', '20150101', '20160101', '20170101');

    I have chosen a single partition for all data before 2012 (since that's just a few rows anyway). The last partition (2017 and later) will remain empty. That is a best practice, and I recommend splitting it *before* data starts to appear in this partition. Splitting an empty partition is metadata only, as soon as data exists it becomes much more costly. For the same reason, I recommend archiving the pre-2012 partition (by using partition swapping), but leaving it in place - once you have also archived the 2013 data and beth first partitions are empty, you can merge them - again metadata only and very fast because no data has to be moved.

    Since you already found several articles, I assume that you can figure out how to create the partition scheme and set up the table and indexes.

    Thanks for your advice.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • The records in the call table are updated Regularly.

    Thanks.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • deja vu

    http://www.sqlservercentral.com/Forums/Topic1696588-3077-1.aspx

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Welsh Corgi (1/19/2016)


    The records in the call table are updated Regularly.

    Thanks.

    So, you're saying that rows for calls placed in 2014 can be updated in 2016? Something is VERY wrong here and I wouldn't touch this table until 1) you understand all of the ramifications and requirements of this table and 2) until you've done a serious deep dive on partitioning, files, filegroups, and "Piecemeal Restores".

    {EDIT}... and, yeah... that's going to take some time but it's going to be well worth every precious second.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Heh... I thought this looked familiar.

    My advice in my previous post stands, though. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • There are only 1,296,391 records but the table is wide.

    I recommended that we did not partition the table. There is too much risk among other things but someone thinks that it would be better to partition the table.

    Yes the reason they want it is for performance but I have improved the performance.

    Good advice and your input should be enough to talk my way out of having to do this.

    Thanks. :blush:

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Welsh Corgi (1/19/2016)


    There are only 1,296,391 records but the table is wide.

    I recommended that we did not partition the table. There is too much risk among other things but someone thinks that it would be better to partition the table.

    Good advice and your input should be enough to talk my way out of having to do this.

    Thanks. :blush:

    I absolutely agree. 1.3 million rows over such a long period just isn't worth partitioning. I would, however, take a serious look at those 33 separate indexes to see how many of them are actually being used. I understand (wrote 2 worldwide call accounting packages all in T-SQL in the past) the nature of CDRs (Call Detail Records) being wide but 33 indexes seem a bit much.

    You can take a look at sys,dm_db_index_usage_stats to start your quest there.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply