Merge/Split/Switch partition's affect on indexes?

Question

Merge/Split/Switch partition's affect on indexes?

SystemlordAnubis

Old Hand

Points: 364
More actions
May 24, 2019 at 1:35 am

#3646742

Hi All,
Currently I have a partitioned table that is partitioned by date. On this table, I have 3 indexes that are not partitioned, each in their own filegroup.
I'm now wanting to archive earlier portions of this partitioned data into another table of similar structure; when executing a Merge/Split or a Switch style archive method - how would this affect the indexes that are not partitioned along with the main data?
Thanks
Anubis

Viewing 12 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

frederico_fonseca SSCoach Points: 16225 More actions · Answer 1

in order to do the switch you need to drop the non partitioned indexes and recreate them afterwards

see https://docs.microsoft.com/en-us/sql/t-sql/statements/alter-table-transact-sql?view=sql-server-2017 and go to the switch partition description

SystemlordAnubis Old Hand Points: 364 More actions · Answer 2

Hi Frederico,

Thanks for the prompt reply - that's unfortunate as it would create a huge overhead for this table - does the same apply if the indexes exist in the same partition scheme as opposed to being isolated?

Thanks

frederico_fonseca SSCoach Points: 16225 More actions · Answer 3

as per the article if all indexes are on the same PS it will work.

But you need to consider is that some indexes that are non partitioned may have been created because of the possible performance issues that partition tables can have. So while changing them to be partitioned will help moving data out it can cause some queries to perform worst.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 4

SystemlordAnubis wrote:

Hi Frederico, Thanks for the prompt reply - that's unfortunate as it would create a huge overhead for this table - does the same apply if the indexes exist in the same partition scheme as opposed to being isolated? Thanks

In most cases no (one exception being columnstore indexes in SQL Server 2012). Not only can you retain your partition-aligned indexes, you can apply index maintenance at the partition level too.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Jeff Moden SSC Guru Points: 1004601 More actions · Answer 5

Just curious...

How many rows in this table?
How many GB does the Clustered Index occupy?
How many partitions are there in the table?
Is each partition in a separate file in a separate file group (1 file per partition, 1 file group per file)?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

SystemlordAnubis Old Hand Points: 364 More actions · Answer 6

Hi All,

Thanks for the replies.

I have considered aligning the indexes with the partition scheme but as you mentioned Frederico, this requires tuning of the queries using the indexes to make sure they're not impacting performance - currently investigating this now.

The table in question:

Currently 500,000,000+, growing by 5-7million per day minimum
Data space occupied currently is around 40GB, indexes (3 off) also consuming around 40GB
There is presently around 90-100 partitions (one-per-day).
Yes, one partition per filegroup, one file per filegroup.

The intention with this table is for the data to remain in the "live" table for around 180 days, the 3 indexes provide quick access to needed data - thereafter, the older data is to be archived - moved as quickly/efficiently as possible from the live table, into an archive table that exists on slower/larger storage media.

Thanks

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 7

What version of SQL Server are you using?

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Jeff Moden SSC Guru Points: 1004601 More actions · Answer 8

SystemlordAnubis wrote:

Hi All, Thanks for the replies. I have considered aligning the indexes with the partition scheme but as you mentioned Frederico, this requires tuning of the queries using the indexes to make sure they're not impacting performance - currently investigating this now. The table in question:
Currently 500,000,000+, growing by 5-7million per day minimum
Data space occupied currently is around 40GB, indexes (3 off) also consuming around 40GB
There is presently around 90-100 partitions (one-per-day).
Yes, one partition per filegroup, one file per filegroup.
The intention with this table is for the data to remain in the "live" table for around 180 days, the 3 indexes provide quick access to needed data - thereafter, the older data is to be archived - moved as quickly/efficiently as possible from the live table, into an archive table that exists on slower/larger storage media. Thanks

From the looks of it, the only reason to partition this table is to take advantage of the SWITCH ability to quickly release older data from the table so that it can be archived.

Despite the number of rows involved, this isn't a big table nor is it very wide. That being said, I think that having 1 partition per day is serious overkill. If it were me (I've got stuff like this and my recommendation comes from use), I'd only use monthly partitions. The reduction in partitions might help performance a fair bit for queries that span more than 1 day because of the much more effective B-TREE structures that fewer partitions will provide.

If the 40GB for indexes includes the Clustered Index (which is where the data lives), then your non-clustered indexes (NCI) are extremely small compared to the data. Your non-aligned NCIs should rebuild in a flash and not actually be a performance issue if you want to drop them for the SWITCH out and then rebuild them. It should be especially non-obnoxious if you only have to do a SWITCH out once per month instead of once per day.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

SystemlordAnubis Old Hand Points: 364 More actions · Answer 9

Chris, The version is 2017 enterprise.

Jeff, Thanks again for the detailed response - and it's good to know the info comes from real-world background.

You're correct, the only reason for daily is to reduce the time for switching; however I have two types of queries that primarily hit this data - one is seeking the last day's worth of data, the other query is the last month's worth of data. All queries know the date bounds that they're seeking in addition to the other data columns or filtering results.

The 40GB of index data is in addition to the 40GB of clustered index data (just coincidence they happen to be similar in size). The table contains a bigint identity (as unique record reference) and a datetime2(7) column which forms the primary key. Partitioning occurs on the datetime2(7) column.

The challenges I'm encountering with this environment is that records are bulk-logged (20-50K rows total across different transactions) into the table every few minutes or so; I'm also seeing high index fragmentation which I'm managing on a nightly basis to keep it in check.

After data hits the table, I need to perform some basic processing on the data and a comparison to the previously logged record specific to that rows' source. I'm currently achieving this with bulk update scripts as agent jobs in batches.

Certainly keen to understand new/better ways at handling this type of situation.

Thanks

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 10

If you can batch up those inserts so that they are taking place under your control every 24 hours rather than at random every few minutes or so, your whole playing field will change.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

SystemlordAnubis Old Hand Points: 364 More actions · Answer 11

Hi Chris,

Unfortunately not; it's coming from many field devices and needs to be accessible as soon as it's generated.

Thanks.

Merge/Split/Switch partition's affect on indexes?

Cookies on SQLServerCentral