Partition key should be part of Primary key

  • Hi,

    I would like to know, when I'm creating a partition on a table, partition field should be part of primary key or unique key?

    More specifically, I have table with existing 2 columns (example ColumnA, ColumnB) in the primary key. I would like create a partition on ColumnC. Is it necessary to add ColumnC as a part of primary key?

    Any comments will be appreciated.

  • I wouldn't think so. If, for instance, you partition on date, you won't want the date to be in the primary key. The partition value is not used to uniquely identify the record, but merely to determine its location on the partitioning scheme. I would choose to leave them separate as there is no gain by adding the partition column to the primary key.

  • Logically, it does NOT make sense to add partition column as part of primary key and I know databases like Oracle has no such limitation. But a SQLServer DBA insists, SQLServer partition design require to have partition key as part of primary key.

  • This is just wrong. I know of no requirement like that. Partitioning has nothing to do with referential integrity and it shouldn't even enter into the discussion.

  • No it does not need to be part of the primary key. In fact most partitions are not on the primary key but on some date field. Which is what Jeff was trying to say above I think. So keep your primary key as-is.

  • Hey Guys

    I'm currently trying to create a table with a partition key that is not part of the primary key, but afetr I create the partition function,scheme ansd table and insert values into the table, they are not getting partitioned. I have provided the scripts below

    --- Create Partition Range Function

    CREATE PARTITION FUNCTION TestDB_PartitionRange (datetime)

    AS RANGE LEFT FOR

    VALUES ('2010-10-21');

    GO

    --- Attach Partition Scheme to FileGroups

    CREATE PARTITION SCHEME TestDB_PartitionScheme

    AS PARTITION TestDB_PartitionRange

    TO ([PRIMARY], Secondary);

    GO

    --Create table on partition scheme

    CREATE TABLE TestTable

    (ID INT NOT NULL,

    Date DATETIME

    ,CONSTRAINT [pc_work_PK_new] PRIMARY KEY CLUSTERED ([ID] ASC ) on [primary]

    )ON TestDB_PartitionScheme (Date);

    GO

    --- Insert Data in Partitioned Table

    INSERT INTO TestTable (ID, Date) -- Inserted in Partition 1

    VALUES (1,'2010-10-20');

    INSERT INTO TestTable (ID, Date) -- Inserted in Partition 2

    VALUES (11,'2010-10-21');

    INSERT INTO TestTable (ID, Date) -- Inserted in Partition 2

    VALUES (12,'2010-10-22');

    GO

    --- Verify Rows Inserted in Partitions

    SELECT *

    FROM sys.partitions

    WHERE OBJECT_NAME(OBJECT_ID)='TestTable';

    GO

    When I verify that rows are in the different partitons, they are not.

    Any help appreciated

  • Hi,

    try this:

    --- Create Partition Range Function

    CREATE PARTITION FUNCTION TestDB_PartitionRange (datetime)

    AS RANGE LEFT FOR

    VALUES ('2010-10-21');

    GO

    --- Attach Partition Scheme to FileGroups

    CREATE PARTITION SCHEME TestDB_PartitionScheme

    AS PARTITION TestDB_PartitionRange

    TO ([PRIMARY], [Secondary]);

    GO

    --Create table on partition scheme

    CREATE TABLE TestTable

    (ID INT NOT NULL,

    Date DATETIME

    )ON TestDB_PartitionScheme (Date);

    GO

    CREATE CLUSTERED INDEX [IX_TestTable] ON TestTable([ID] ASC, [Date] ASC) ON TestDB_PartitionScheme(Date)

    GO

    ALTER TABLE TestTable ADD CONSTRAINT [pc_work_PK_new] PRIMARY KEY ([ID] ASC ) ON [PRIMARY]

    GO

    --- Insert Data in Partitioned Table

    INSERT INTO TestTable (ID, Date) -- Inserted in Partition 1

    VALUES (1,'2010-10-20');

    INSERT INTO TestTable (ID, Date) -- Inserted in Partition 2

    VALUES (11,'2010-10-21');

    INSERT INTO TestTable (ID, Date) -- Inserted in Partition 2

    VALUES (12,'2010-10-22');

    GO

    --- Verify Rows Inserted in Partitions

    SELECT *

    FROM sys.partitions

    WHERE OBJECT_NAME(OBJECT_ID)='TestTable';

    GO

    I don't know if it's the best solution, but it works.

  • I know this is an old post, but completeness of this thread I would like to add the below comments.

    DBA as well as Jeff.Mason was partially correct in the sense that normally by default primary keys are clustered. Eventhough it is not a requirement for a partition column NOT to be in the primary key, but best practice is for a partition column to be part of the clustered index. Without this, i.e. partition key not in the clustered index, and since in SQL Server, all indexes and data sitting on top of clustered index, SQL Engine will always perform a table-scan and not partition scan and hence creating a partition is not affective at all.

    Cheers,

    Fredy

  • jeff.mason (10/4/2010)


    This is just wrong. I know of no requirement like that. Partitioning has nothing to do with referential integrity and it shouldn't even enter into the discussion.

    I also understand that this is an old post but need to comment on the above now that I've seen it.

    I agree that you don't need to add the partitioning column to the PK (a unique index under the hood) or any other unique index [font="Arial Black"]BUT [/font] then you lose some of the advantages of having a partitioned table making it almost not worth while.

    Not adding the partitioning column to PK/Unique indexes means that the indexes will not be "aligned". That not only means that you can't use SWITCH to archive older data, it can also mean that the index isn't partitioned unless you partition it separately (non-aligned unless you add the partitioning column to it for unique indexes. If you don't, you still won't be able to use SWITCH in such a case). If the index isn't partitioned, then you don't enjoy the benefits of lighter index maintenance periods. Of course, if the clustered index isn't partitioned, neither is your table.

    SWITCH may not be important, though. You generally only need that if you want to very quickly SWITCH in millions of rows from a staging table to the main table or if you want to very quickly SWITCH out older data into a separate table and drop the table (takes milliseconds).

    So there is a requirement to add the partitioning column to PK/UNIQUE indexes but only if you want SWITCH to be available.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Fredy James (4/7/2014)


    I know this is an old post, but completeness of this thread I would like to add the below comments.

    DBA as well as Jeff.Mason was partially correct in the sense that normally by default primary keys are clustered. Eventhough it is not a requirement for a partition column NOT to be in the primary key, but best practice is for a partition column to be part of the clustered index. Without this, i.e. partition key not in the clustered index, and since in SQL Server, all indexes and data sitting on top of clustered index, SQL Engine will always perform a table-scan and not partition scan and hence creating a partition is not affective at all.

    Cheers,

    Fredy

    Just a couple of additional notes there...

    If the index supports a query in such a fashion as to do an index seek (and a possible range scan for multi-row queries), it won't do a table scan any more than it would in a monolithic table. And, IIRC, the partition scans are only possible if you include the partitioning column in the criteria of the query. Since partition scans are really like smaller table scans, they'll still be a lot less effective than good code that does index seeks. Performance isn't necessarily the primary reason that you'd want to partition a table because such performance has a pretty narrow scope of queries.

    The other thing is that you may have to consider whether or not the PK is going to be a target of an FK. If it is, you can't add the partitioning column to it. You'll have to live with a non-aligned unique index.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply