SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

Partitioning - Part 4

By Andy Warren,

So far I've covered the basics of partitioning strategies in Part 1, archiving in Part 2, and partitioned views in Part 3. This time we'll finish up by looking at the partitioning features now available in the Enterprise edition of SQL 2005.

I'm going to start with deploying a simple partition. It's a three step process:

  • Create a partitioning function
  • Create a partitioning scheme
  • Create the table using the partitioning scheme

For a simple example we're going to partition the Person.Contact table in Adventureworks by year. We'll start by creating the partitioning function:

CREATE PARTITION FUNCTION fnPartitionByYear (datetime)

In this case we're creating our 'bucketing' mechanism. We want to create partitions for each year. One slightly confusing part is whether to use LEFT or RIGHT. Essentially you're telling the function whether your values are the last ones in each bucket (LEFT), or the first one in each bucket (RIGHT). I like RIGHT for dates because I want 1/1/1996 12/31/1996 23:59 in the same partition and it's easier than using LEFT with 12/31/1996 23:59:59, 12/31/1997 23:59, etc. SQL will apply this function to each row to figure out what partition it belongs to when you load (or modify) data.

Now we'll create a partitioning scheme using the function we just created. A scheme is a super container that can hold one or more filegroups. I've used the simpler ALL syntax here to show that I want to create all the partitions on Primary. Note that because primary is a reserved word it requires the brackets. Putting all on one filegroup isn't as good as using multiple filegroups across different drives, but on my laptop I have only one drive and I wanted to illustrate the ALL usage. It is possible to specify a filegroup for each partition and even to specify what filegroup will be used when the next partition is created.

AS PARTITION fnPartitionByYear
ALL to ([Primary])

We should see this message:

Finally, we'll create a new table called ContactPartitioned that is almost the same as the original table. One of the rules is that partitioning columns have to be a part of any unique index (as I understand it, this helps the optimizer know that it only needs to check a single partition) if you want that index to be on the same partition. So our choices based on the design of the original Contact table are to append the modifieddate to the the previously defined clustered index so that everything else stays the same, or we change the primary key to be non clustered and create it directly on a filegroup. I know it's a little confusing! It's absolutely worth spending a few minutes using my code samples to work through all of this so you can see first hand. The "magic" is the last line where instead of creating the table on FILEGROUP, we create it on SCHEME. In this case, our just created ContactScheme and we pass in the partitioning column of modified date.

CREATE TABLE [Person].[ContactPartitioned](
	[NameStyle] [dbo].[NameStyle] NOT NULL   DEFAULT ((0)),
	[Title] [nvarchar](8) NULL,
	[FirstName] [dbo].[Name] NOT NULL,
	[MiddleName] [dbo].[Name] NULL,
	[LastName] [dbo].[Name] NOT NULL,
	[Suffix] [nvarchar](10) NULL,
	[EmailAddress] [nvarchar](50) NULL,
	[EmailPromotion] [int] NOT NULL  DEFAULT ((0)),
	[Phone] [dbo].[Phone] NULL,
	[PasswordHash] [varchar](40) NOT NULL,
	[PasswordSalt] [varchar](10) NOT NULL,
	[AdditionalContactInfo] [xml](CONTENT [Person].[AdditionalContactInfoSchemaCollection]) NULL,
	[rowguid] [uniqueidentifier] ROWGUIDCOL  NOT NULL   DEFAULT (newid()),
	[ModifiedDate] [datetime] NOT NULL  DEFAULT (getdate())
 CONSTRAINT [PK_Contact_ContactIDNew] PRIMARY KEY CLUSTERED ([ContactID], ModifiedDate ASC))
 ON ContactScheme(modifieddate)

To see if everything is really working, we'll copy the data from Person.contact into the new table:

INSERT INTO Person.ContactPartitioned
from person.contact

Let's run two simple queries so that we can compare apples to apples:

select  * from person.contactpartitioned where modifieddate = '1/1/2004'
select  * from person.contact where modifieddate = '1/1/2004'

The both return the same number of rows, but if just look at the plans we see that there is a big difference in the relative query costs:

We can see that in both cases we used a clustered index scan (because we elected to return all columns and because we have no index on modifieddate in either table), but our query against our new partitioned table seems to be much more efficient. We can confirm that with a quick Profiler session:

Another way to confirm that partitioning is actually being used is to hover over the plan operator and see if it shows a partition id line (in red below):

To see how the data was allocated, we can run this query:

select object_name(object_id), partition_number, rows from sys.partitions

If we query the data by year directly, we get:

select year(modifieddate), count(*) as PartitionCount
from person.contactpartitioned
group by year(modifieddate)
order by year(modifieddate)

We only get 10 rows. Why? The person.contact table contains no data for 2006 or 2007, but we created partitions for those year in our partitioning function. That accounts for 2 of the missing 3. The other one is the first partition with 0 rows, created to handle any data that might be prior to 1/1/1996.

Here are a few miscellaneous notes about partitioning:

  • Remember its only Enterprise edition (and Developer of course)
  • Indexes can be partitioned to match the table partition, or not. Just as partitioning a table can boost speed in certain cases, same is true for indexes
  • It's horizontal only
  • There's a lot of options and rules to partitioning and no UI to help you out. Before you go big, build a small table with some test data that matches your scenario and work through the steps

I blog once a week or so at http://blogs.sqlservercentral.com/blogs/andy_warren/default.aspx about SQL Server, SQL user groups, and related topics. I hope you'll visit and comment occasionally!

Total article views: 7106 | Views in the last 30 days: 40
Related Articles

creating filegroup

creating filegroup


Partitions and Filegroups

Sliding-Window Partitions and their usage of Filegroups when Switching In/Out





Altering Partition Scheme and Function

How to Alter Partition Scheme and Function


Partition Function to an existing table

What's the best partition function?