Partition design questions for DW fact

  • I can understand the benefits of partitions for a large fact table -  let's say partitioned by month.

    1. If loading daily data we can load to stage partition and switch it in to fact table.
    2. This allows the table be more easily archived for instance 36 partitions only - move the early partition off to archive in an automated way.
    3. It also gives greater back up flexibility - if only 12 months of data can change then only 12 partitions have to be backed up.
    4. Better flexibility for storage performance -  fast/slow disks  
    4. Benefits queries to a degree though partition elimination
    5. Benefits cube loading with the right partition key.

    Not so sure about what happens under the following circumstances :
    1. Hot data across multiple partitions - merge is partition aware hence may not be a problem.  Do partitions need to be designed around hot/changing data?   
    2. Schema changes in the current partition let's say int to bigint or nvarchar(10) to nvarchar(50) - not sure the impact of this with partitions
    3. Schema changes such as new column in the current partition.

    Can anyone explain if 1, 2, 3 i.e. do they offset the benefits of partitions? or can we design partitions around these factors?

  • Are the benefits listed above the reason you are considering partitioning? My take is to really define what you want to gain. We have had developers here chuck weird partitioning schemes on tables in an attempt to get a performance boost. That is not a great reason to implement partitioning. Meaning there are other ways to tune your query performance. You can see gains with partition elimination but generally your queries in the field aren't going to be written to take advantage of partition elimination without some tweaking which could actually slow things down instead of helping. With that said...
    1. You can insert/update into the non-current partition if partitioned by date...is that what you were asking? I might need more info on this one to give a good answer. 
    2. The data type of a column of a partitioned table cannot be changed.
    3. In addition to performing SWITCH operations that involve partitioned tables, ALTER TABLE can be used to change the state of the columns, constraints, and triggers of a partitioned table just like it is used for nonpartitioned tables. However, this statement cannot be used to change the way the table itself is partitioned. To repartition a partitioned table, use ALTER PARTITION SCHEME and ALTER PARTITION FUNCTION. Additionally, you cannot change the data type of a column of a partitioned table.

    Both replies for 2 and 3 were copied and pasted directly from the link below. 
    https://docs.microsoft.com/en-us/sql/t-sql/statements/alter-table-transact-sql?view=sql-server-2017

  • Thanks Daniel.   I'm thinking of partitioning to manage data more so than performance for very large fact tables.  Although as you say performance increase is secondary.  With large tables even archiving rows  becomes a headache with no partitions.
    I appreciate there is a big admin overhead as well as constraints hence important to understand this.  I'm going to experiment with a lab set up.  At the moment though I'm not sure I'd like to live with the constraint of no schema change, having said that, schema changes with non-partitioned tables is expensive too.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply