Can you create a partition on an existing Fact table?

  • Hi All,

    I have a Fact table that has been in use for 2 years now, and I am getting to the point where the table is beginning to get big. It is currently sitting at roughly 40 million records. I have BO reports that are built off this table, and a nightly ETL schedule repopulates this table every night (along with other staging, dimension, etc. tables).

    What I am looking to do is improve the performance of recreating this fact table. The fact table already has about 6 years' worth of data in it (education data). What I want to do (because I THINK this would help performance) is delete only the current records and then reload the current records each night, leaving all historical data in this fact table untouched. (If it's not changing, why delete and reload it every night?)

    When testing this process, I found that deleting 5 million records from this table was taking an hour. After researching a little, I felt that building partitions on this fact table might greatly help me achieve what I want to do. The fact table is already loosely broken into chunks: State Tests, Course Grades, Attendance, Incidents, etc.

    Based on this, can I create partitions on this already existing FACT table, or would I have to build a brand-new fact table with partitions and then transfer the data over?

    Please let me know your opinions.

    Thanks for the help.

  • You can partition an existing table: not by adding partitions to it, but by rebuilding the clustered index onto a partition scheme. This will take as long as rebuilding the clustered index usually takes; it won't be instant.

    Once partitioned, you can switch out a partition of data (swap a partition with an empty table) to do fast 'deletes'.
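For a concrete picture of what that setup looks like, here is a minimal sketch. Every object name (pf_SchoolYear, ps_SchoolYear, and so on) is made up for illustration; the thread doesn't give real table or column names.

```sql
-- Sketch only: a partition function and scheme for a fact table split
-- by school year. All names are placeholders, not from the original post.

-- One boundary value per school year; RANGE RIGHT means each boundary
-- value belongs to the partition on its right.
CREATE PARTITION FUNCTION pf_SchoolYear (int)
AS RANGE RIGHT FOR VALUES (2009, 2010, 2011, 2012, 2013, 2014);

-- Map every partition to the PRIMARY filegroup for simplicity; in
-- production you might spread partitions across multiple filegroups.
CREATE PARTITION SCHEME ps_SchoolYear
AS PARTITION pf_SchoolYear ALL TO ([PRIMARY]);
```

The clustered index is then rebuilt onto ps_SchoolYear rather than onto a filegroup, which is what physically partitions the table.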

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (5/14/2014)


    You can partition an existing table: not by adding partitions to it, but by rebuilding the clustered index onto a partition scheme. This will take as long as rebuilding the clustered index usually takes; it won't be instant.

    Once partitioned, you can switch out a partition of data (swap a partition with an empty table) to do fast 'deletes'.

    Thanks for the response, GilaMonster. Would you by any chance know of an article that could walk me through this process? I have read about creating the partition function and scheme.

    When rebuilding the clustered index, can I just drop it and rebuild it so that it points to the newly created partition scheme?

    As for switching out a partition of data (which sounds like what I am ultimately looking for), I don't know anything about this. Any info would be greatly appreciated.

    Thanks

  • Well, my dreams may have been crushed. I tried to create a partition function, and when I hit the execute button I got the following message:

    Partition function can only be created in Enterprise edition of SQL Server. Only Enterprise edition of SQL Server supports partitioning.

    Is there any workaround for this?

  • skaggs.andrew (5/14/2014)


    Thanks for the response, GilaMonster. Would you by any chance know of an article that could walk me through this process? I have read about creating the partition function and scheme.

    Start with Books Online; there's a lot in there.

    When rebuilding the clustered index, can I just drop it and rebuild it so that it points to the newly created partition scheme?

    No, you don't want to do that. Just do a CREATE ... WITH DROP_EXISTING and specify the partition scheme where the filegroup would usually go.
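As a sketch of that statement, assuming a partition scheme named ps_SchoolYear already exists and the table is clustered on a SchoolYearID column (the index, table, column, and scheme names here are all hypothetical):

```sql
-- Rebuild the existing clustered index onto the partition scheme in one
-- operation. DROP_EXISTING = ON drops and recreates the index in a single
-- statement, avoiding a separate DROP INDEX plus CREATE INDEX, each of
-- which would sort the whole table.
CREATE CLUSTERED INDEX CIX_FactStudent
    ON dbo.FactStudent (SchoolYearID)
    WITH (DROP_EXISTING = ON)
    ON ps_SchoolYear (SchoolYearID);
```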

    As for switching out a partition of data (which sounds like that is what I am ultimately looking for) I dont know anything about this. Any info would be greatly appreciated.

    Again, Books Online has a lot on the subject.
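For reference, the switch-out pattern mentioned above looks roughly like this (all object names are hypothetical; SWITCH requires the target table to be empty, identical in structure, and on the same filegroup as the source partition):

```sql
-- Fast 'delete': move the current year's partition into an empty staging
-- table, then truncate the staging table. Both steps are metadata-only
-- operations and run in moments regardless of row count.
ALTER TABLE dbo.FactStudent
    SWITCH PARTITION $PARTITION.pf_SchoolYear(2014)
    TO dbo.FactStudent_Switch;

TRUNCATE TABLE dbo.FactStudent_Switch;
```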

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Poor man's partitioning (though, this was the way we all used to have to do it 😀 ) -> Partitioned Views. Details from Microsoft here.

    Steve.

  • stevefromOZ (5/14/2014)


    Poor man's partitioning (though, this was the way we all used to have to do it 😀 ) -> Partitioned Views. Details from Microsoft here.

    Is this essentially physically creating smaller tables and then unioning them all together in a view? What are the downsides of doing it this way vs. having the capability of using partitioning?

    Would it be better to try and get the Enterprise Edition?

  • skaggs.andrew (5/14/2014)


    Hi All,

    I have a Fact table that has been in use for 2 years now, and I am getting to the point where the table is beginning to get big. It is currently sitting at roughly 40 million records. I have BO reports that are built off this table, and a nightly ETL schedule repopulates this table every night (along with other staging, dimension, etc. tables).

    What I am looking to do is improve the performance of recreating this fact table. The fact table already has about 6 years' worth of data in it (education data). What I want to do (because I THINK this would help performance) is delete only the current records and then reload the current records each night, leaving all historical data in this fact table untouched. (If it's not changing, why delete and reload it every night?)

    When testing this process, I found that deleting 5 million records from this table was taking an hour. After researching a little, I felt that building partitions on this fact table might greatly help me achieve what I want to do. The fact table is already loosely broken into chunks: State Tests, Course Grades, Attendance, Incidents, etc.

    Based on this, can I create partitions on this already existing FACT table, or would I have to build a brand-new fact table with partitions and then transfer the data over?

    Please let me know your opinions.

    Thanks for the help.

    Yes. Have a look at "Partitioned Views" in Books Online. They have a different set of caveats but can be just as effective as "Partitioned Tables".
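A minimal local partitioned-view sketch, with made-up names and a two-year split (the CHECK constraint on the partitioning column is what lets the optimizer skip tables that cannot match a query's predicate):

```sql
-- One physical table per year; the CHECK constraint defines the 'partition'.
-- All table and column names are illustrative, not from the thread.
CREATE TABLE dbo.FactStudent_2013 (
    SchoolYearID int NOT NULL CHECK (SchoolYearID = 2013),
    StudentKey   int NOT NULL,
    Score        decimal(9, 2) NULL
);

CREATE TABLE dbo.FactStudent_2014 (
    SchoolYearID int NOT NULL CHECK (SchoolYearID = 2014),
    StudentKey   int NOT NULL,
    Score        decimal(9, 2) NULL
);
GO

-- The view unions the yearly tables; the nightly load then only has to
-- truncate and reload the current year's base table.
CREATE VIEW dbo.vFactStudent
AS
SELECT SchoolYearID, StudentKey, Score FROM dbo.FactStudent_2013
UNION ALL
SELECT SchoolYearID, StudentKey, Score FROM dbo.FactStudent_2014;
```

Reports would query dbo.vFactStudent exactly as they queried the single table before.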

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks, Jeff. I will read up on this. It sounds like my only option right now, and it would be better than just using this standalone table. Thanks, all.

  • It's not show-stopping, but the management is a bit of a PITA. You would be well served to create some management procs that do things like dropping and recreating the view as/when things change (i.e., not as easy as just partition switching in/out; you can't just take a referenced table out from under the view).

    Check out what your price would be for Enterprise, but if you're "off the street" without any pricing/discounts at your disposal, the price differential is going to be pretty large.

    Steve.

  • Thanks, Steve. Last question: let's say partitioning is not an option and I don't want to use a partitioned view. Are there any other things I can do to the fact table to help me out?

  • Does the table have a key or other indexes? A surrogate key? If so, you might be able to get your deletes to run faster, but (and it's a BIG but) this also depends a lot on your setup, as in *why* you are having to rebuild/reload the entire table every load. The reasons behind that may limit what your options are. 40 million rows isn't 'huge', but loading it every day could get to be a pain.
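One technique worth noting here (an editorial addition, not suggested in the thread): even without partitioning, deleting in small batches keeps each transaction short, keeps the log from ballooning, and works on any edition. All names below are placeholders:

```sql
-- Batched delete sketch: remove the current year's rows in chunks
-- instead of one giant hour-long transaction.
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (50000)
    FROM dbo.FactStudent
    WHERE SchoolYearID = 2014;   -- the slice being reloaded tonight

    SET @rows = @@ROWCOUNT;      -- loop ends when a batch deletes nothing
END;
```

Batch size is a tuning knob; smaller batches mean less blocking but more iterations.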

    Steve.

  • As of now, there are several nonclustered indices on the fact table (these were built to help improve the querying of the BO reports). We also have several surrogate keys in the table. We rebuild this fact table (along with other dimension, staging, etc. tables) every night because the data in the source tables is updated on a daily basis, and rebuilding is how we get the updated data into the BO reports.

    The entire process of deleting records, dropping indices, loading data, and rebuilding indices takes roughly 3 1/2 hours each night. I would really like to cut this in half.
