Partitioning on an existing large production table - how to decide?

  • Hello,

    I have a large production table:

    - 64 columns

    - around 800 millions of rows

    - around 80 GB of data + index (18 GB for the index size)

    with the following flow:

    - around 1 million records deleted every day

    - around 900.000 records inserted every day

    - a query that will update another table with the modifications made every day (insertes & deleted) - based on a datetime column (indexation_day)

    - some other small queries not very relevant

    There is a maintenance plan that treats the fragmented indexes and the statistics that are not updated, but this maintenance plan also works very slow because of the large amount of data in the table and also large amount of modifications made every day.

    I was thinking of partitioning the table, based on the datetime field(indexation_day) and wondering what would be the the impact on performance.

    Some of my questions questions are:

    - should i make the datetime field (on which the biggest query is based on) part of my clustered index (which is an int, an id at this point)? the next records inserted into the datetime field will be in ascending order.

    - what will be the impact on maintenance on the partitions?

    - should i split the partitions on different filegroups ? (read some articles how it would not be very good on performance)

    - how can i use the SWITCH operator to allow extremely fast loading and removal of large amounts of data?

    I have read on a website that 'Prescribing table partitioning is like recommending significant surgery'. I hope my questions are accurate, but my main issue is the partitioning affect on my existing (slow) performance and on the existing (slow) maintenance plan. Any thoughts on this will be extremely welcomed!

    Thank you for reading a big posts and thank you in advance for every tip!

  • If you're partitioning solely for performance, don't waste your time.

    https://www.simple-talk.com/sql/database-administration/gail-shaws-sql-server-howlers/

    If you're partitioning for fast data loads and easier maintenance, then the table's insert patterns and the business rules for archiving/deleting data will be your guide as to what you're partitioning on

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thank you for your reply!

    My situation is that i want to find the perfect balance between performance and data loads and maintenance.

    For the query performance issue, i was thinking that if the query is made based on the filter indexation_day, which is the partitioning key, it should work better with the partitioned table, than the case where the table is not partitioned and i only have a non-clustered index on indexation_day.

    The reason I presented my table and the every day flow, was also to find out some views on this issue.

    Thank you!

  • Partitioning is not primarily for performance, have a read through the article I referenced.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply