A Tale of Partition Schemes and Disappearing Dragons

Question

Post reply

A Tale of Partition Schemes and Disappearing Dragons

porter.james

Mr or Mrs. 500

Points: 590
More actions
November 1, 2015 at 11:03 pm

#309363

Comments posted to this topic are about the item A Tale of Partition Schemes and Disappearing Dragons

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply

akljfhnlaflkj SSC Guru Points: 76202 More actions · Answer 1

akljfhnlaflkj

SSC Guru

Points: 76202

November 2, 2015 at 6:34 am

#1837042

Interesting tale of dragons.

Jeff Moden SSC Guru Points: 1004701 More actions · Answer 2

I loved the implications of being "fired". 🙂

Great article that explains one of the better reasons to partition a table and well done on saving yourself by keeping the old data until you absolutely know the nightly process completed successfully. It helps you "urn" 😛 your pay according to your boss.

I do have a question, though. Do you find that the "nonclustered columnstore index" actually helps performance that much or could you actually do without it with a couple of tweaks to the other indexing on the table?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

nicolas_johnson 52979 SSC Rookie Points: 44 More actions · Answer 3

Another approach is to use views to alternate source tables by updating the view at the end of the load procedure to point at the last updated table. What would be some pros/cons to using the written partition approach vs views?

g.britton SSChampion Points: 13863 More actions · Answer 4

nicolas_johnson 52979 (11/2/2015)
Another approach is to use views to alternate source tables by updating the view at the end of the load procedure to point at the last updated table. What would be some pros/cons to using the written partition approach vs views?

Simplest reasons: partitioned views are on the way out, and a partitioned table gives you fewer objects to manage.

Gerald Britton, Pluralsight courses

bsclyde Say Hey Kid Points: 683 More actions · Answer 5

Excellent post! It was informative and entertaining. Thank you!

Jeff Moden SSC Guru Points: 1004701 More actions · Answer 6

g.britton (11/2/2015)
nicolas_johnson 52979 (11/2/2015)
Another approach is to use views to alternate source tables by updating the view at the end of the load procedure to point at the last updated table. What would be some pros/cons to using the written partition approach vs views?
Simplest reasons: partitioned views are on the way out, and a partitioned table gives you fewer objects to manage.

Ah... don't write off Partitioned Views just because of non-descriptive paragraph that represents a bloated Microsoft opinion to try to drive more sales to the more expensive Enterprise Edition. Partitioned views will be around as long as the Standard Edition is around and MS doesn't see fit to update it for Partitioned Tables.

I'll also say that there are advantages to Partitioned View that many haven't had the need to consider. For example, I can index the current month partition in Partitioned Views differently than older months without having to worry about such things as whether or not an index is aligned or not, especially when the older months have had their file groups set to Read-Only. Also, as Kimberly Tripp has pointed out in her MCM video series, it's sometimes beneficial to have both types of partitioning associated with a "single" logical table.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Constant Coder SSC Rookie Points: 45 More actions · Answer 7

We had a similar need to do a quick switch between a staging table and a production table, but simply used three table renames to accomplish this. The renames were very fast. I am not sure if there would be an advantage to using the partition switching technique instead.

Jeff Moden SSC Guru Points: 1004701 More actions · Answer 8

Constant Coder (11/2/2015)
We had a similar need to do a quick switch between a staging table and a production table, but simply used three table renames to accomplish this. The renames were very fast. I am not sure if there would be an advantage to using the partition switching technique instead.

Good point. The table partition switching isn't necessary (and it's a bit of a PITA too boot) if you're replacing the whole table and your system can tolerate the incredibly brief period of the renames. It did that on an SQL based IVR system with 1.2 million customers and the renames are indeed fast and very effective.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

g.britton SSChampion Points: 13863 More actions · Answer 9

synonyms are also a nice tool, and you can avoid renames

Gerald Britton, Pluralsight courses

joeroshan SSChampion Points: 10377 More actions · Answer 10

entertaining and informative. I have used rename and partition switch. I think rename is good if its a one time load. Partition switch is something that can be created as a solution for production jobs. great article

----------------------------------------------------------------------------------------------------------------------------------------------------
Roshan Joe

Jeff Moden -Forum Etiquette: How to post data/code on a forum to get the best help[/url]

Misha_SQL SSCertifiable Points: 5407 More actions · Answer 11

One of a very few times I found myself smiling while reading technical article. Good information. Thank you!

diego.perdices SSCommitted Points: 1701 More actions · Answer 12

diego.perdices

SSCommitted

Points: 1701

October 20, 2017 at 12:19 am

#1964452

Happy dragons!. Nice article.

Chris Harshman SSC-Forever Points: 42192 More actions · Answer 13

This was a nice demonstration of using partition switching to solve the problem of "loosing" the previous version of the data. I suppose this solution would make a bit more sense if instead of doing a TRUNCATE they were only removing and rebuilding say the current month or year of data instead of truncating the entire table.

I think the real thing this story highlights though is that data cleansing IS THE MOST IMPORTANT and often least respected part of building a data warehouse: "To make a long story short, it turns out one of our suppliers fed us an invalid date value, which we failed to handle correctly, causing the ETL to crash." This kind of situation is usually handled by having an "Unknown" value in your dimension table, so that you always have something to link to your fact record, and then have some process, even if it is manual, to clean up the "Unknown" records later.

Of course, if you want to improve the life expectancy of your dragons, don't take them North of the wall where they could be easily attacked by the white walkers.

Renato10 SSC Enthusiast Points: 198 More actions · Answer 14

To achieve a high availability of the Data Warehouse on functional level there is the concept of StagingArea (the kitchen / workarea to process the source data into Star Data models) and DataMart (the presentation layer). The DataMart is updated after processing the source data in the StagingArea, which will cause a limited unavailability of a Fact during the synchronization activity