SQLServerCentral Article

Speed of Partitioned Views

,

Introduction

There has been some talk about the use of partitioned views in SQL Server 2000 to gain performance improvements and for scale-out of applications.

I am writing this article to put forward my findings in respect of the speed of partitioned views. I hope it will generate more discussion and will give some an insight as to where partitioned views can be used.

What is it about?

There is a Whitepaper on the Microsoft website entitled "Microsoft SQL Server MegaServers: Achieving Software Scale-Out" that describes the throughput of transactions for various configurations in use. It is worth reading.

Also note that there is a difference between a partitioned view (data is all on one server) and a distributed partitioned view (where the data is divided between SQL Servers that are ‘Federated’).

This gives the impression that using partitioned views will give performance improvements.

For this article, I am going to restrict the discussion to partitioned views with all data on one SQL Server (Local Partitioned View).

Before going further I will point out another excellent piece of information. In the book Microsoft SQL Server 2000 Resource Kit there is a section on Data Partitioning Components that states:

If you are attempting to alleviate disk contention, the best approach is to solve this as a standard disk I/O problem. Partitioning data on the same server does not provide noticeable performance gains, and most likely you would be better served to use a single table with good indexes that contain all the data.

This appears to be a contradiction. How can partitioned views achieve performance improvements but a single table with good indexes works better than the view?

What I have found is there is a trade off and some gotchas to using a partitioned view.

We looked at partitioned views as a possible solution to archiving old data (with simple recovery method in case we suddenly need it back) while providing a path to Federated SQL Servers in future.

Measuring the speed for common queries.

First up, take a look at the table schema:

CREATE TABLE [dbo].[tblCRD] (
       [int_detID]          [int] NOT NULL ,
       [tint_RecType]       [tinyint] NOT NULL ,
       [tint_IsCh]          [tinyint] NOT NULL ,
       [int_OLID]           [int] NOT NULL ,
       [int_DLID]           [int] NOT NULL ,
       [int_OrPrefID]       [int] NOT NULL ,
       [int_DesPrefID]      [int] NOT NULL ,
       [sint_OrCtyID]       [smallint] NOT NULL ,
       [sint_DesCtyID]      [smallint] NOT NULL ,
       [tint_OrRegID]       [tinyint] NOT NULL ,
       [tint_DesRegID]      [tinyint] NOT NULL ,
       [sint_IncRteID]      [smallint] NOT NULL ,
       [sint_OutRteID]      [smallint] NOT NULL ,
       [dt_GMTSt]           [datetime] NOT NULL ,
       [dt_LocalSt]         [datetime] NULL ,
       [int_Dr]             [int] NOT NULL ,
       [int_Dist]           [int] NOT NULL ,
       [bit_IsPre]          [bit] NOT NULL ,
       [sint_SceCarrID]     [smallint] NULL ,
       [bit_IsLoc]          [bit] NULL ,
       [int_ClTypeID]       [int] NULL ,
       [tint_PMb]           [tinyint] NULL CONSTRAINT [DF_CRD_PMb] DEFAULT (0),
              CONSTRAINT [PK_tblCRD] PRIMARY KEY  CLUSTERED
              (
                     [int_detID]
              ) WITH  FILLFACTOR = 90  ON [PRIMARY] ,  --Filegroup
              CONSTRAINT [CK_tblCRD_tint_IsCh] CHECK ([tint_IsCh] = 1 or [tint_IsCh] = 0)
) ON [PRIMARY] --Filegroup
GO
 
CREATE  INDEX [IX_tblCRD_dt_GMTSt]
       ON [dbo].[tblCRD]([dt_GMTSt])
       WITH  FILLFACTOR = 98 ON [INDEX] --Filegroup
GO
CREATE  INDEX [IX_tblCRD_sint_SceCarrID]
       ON [dbo].[tblCRD]([sint_SceCarrID])
       WITH  FILLFACTOR = 98 ON [INDEX] --Filegroup
GO
CREATE  INDEX [IX_tblCRD_dt_LocalSt]
       ON [dbo].[tblCRD]([dt_LocalSt])
       WITH  FILLFACTOR = 98 ON [INDEX] --Filegroup
GO

Now this table has approximately 975 million rows and takes up about 120 GB in our production environment.

I have set up a test database for the member tables and partitioned view so I can get comparative performance.

The partitioned view member tables have been designed to take about 32 million rows each. The int_detID field is the partitioning column and has a check constraint for each member table.

Note: All queries for inserts, updates or deletes takes place within Begin Trans/Rollback Trans after hours and after backups complete.

An example of the queries run against the table and view for comparison are as follows:

-- Limit transactions to sizable chunk
SET Rowcount 35000
 
-- get rid of existing global temp table
if exists (
       select * from tempdb.dbo.sysobjects where name = '##tblCRD'
          )
drop table ##tblCRD
GO
 
-- Create Temp Table and fill with rows from original table
SELECT
        [int_detID]        
       ,[tint_RecType]     
       ,[tint_IsCh]        
       ,[int_OLID]         
       ,[int_DLID]         
       ,[int_OrPrefID]     
       ,[int_DesPrefID]    
       ,[sint_OrCtyID]     
       ,[sint_DesCtyID]    
       ,[tint_OrRegID]     
       ,[tint_DesRegID]    
       ,[sint_IncRteID]    
       ,[sint_OutRteID]    
       ,[dt_GMTSt]         
       ,[dt_LocalSt]             
       ,[int_Dr]           
       ,[int_Dist]         
       ,[bit_IsPre]        
       ,[sint_SceCarrID]   
       ,[bit_IsLoc]        
       ,[int_ClTypeID]     
       ,[tint_PMb]
INTO ##tblCRD
FROM [dbo].[tblCRD] (nolock)
order by int_detID Desc
-- Add Index to temp table
CREATE UNIQUE NONCLUSTERED INDEX  NCI_tblCRD ON ##tblCRD(int_detID)
 
-- Showtime
Begin Transaction
Declare  @Duration   Int
       ,@StartTime  datetime
       ,@FinishTime datetime
 
select   @StartTime = Getdate()
       ,@FinishTime  = Getdate()
       ,@Duration = 0
Select 'Start Count ', Getdate()
-- Check temp table has records
SELECT count(*)
FROM [dbo].[tblCRD] CRD, ##tblCRD T
Where T.int_detID = CRD.int_detID
-- Delete from Origin table
Select 'Start Delete ', Getdate()
DELETE [dbo].[tblCRD]
FROM [dbo].[tblCRD] CRD JOIN ##tblCRD T
ON CRD.int_detID = T.int_detID
Where CRD.int_detID >= (Select MIN(T1.int_detID) from ##tblCRD T1)
AND CRD.int_detID <= (Select MAX(T2.int_detID) from ##tblCRD T2)
 
-- Do some processing
Select 'Start Update int_Dr ', Getdate()
Select Top 10 *  from ##tblCRD order by int_detID
 
Update ##tblCRD
Set int_Dr = int_Dr * -1
 
Select Top 10 *  from ##tblCRD order by int_detID
Select 'Before Insert. ',Getdate()
-- Insert back into origin table
INSERT INTO dbo.tblCRD (
        [int_detID]        
       ,[tint_RecType]     
       ,[tint_IsCh]        
       ,[int_OLID]         
       ,[int_DLID]         
       ,[int_OrPrefID]     
       ,[int_DesPrefID]    
       ,[sint_OrCtyID]     
       ,[sint_DesCtyID]    
       ,[tint_OrRegID]     
       ,[tint_DesRegID]    
       ,[sint_IncRteID]    
       ,[sint_OutRteID]    
       ,[dt_GMTSt]         
       ,[dt_LocalSt]             
       ,[int_Dr]           
       ,[int_Dist]         
       ,[bit_IsPre]        
       ,[sint_SceCarrID]   
       ,[bit_IsLoc]        
       ,[int_ClTypeID]     
       ,[tint_PMb]
)
select
        [int_detID]        
       ,[tint_RecType]     
       ,[tint_IsCh]        
       ,[int_OLID]         
       ,[int_DLID]         
       ,[int_OrPrefID]     
       ,[int_DesPrefID]    
       ,[sint_OrCtyID]     
       ,[sint_DesCtyID]    
       ,[tint_OrRegID]     
       ,[tint_DesRegID]    
       ,[sint_IncRteID]    
       ,[sint_OutRteID]    
       ,[dt_GMTSt]         
       ,[dt_LocalSt]             
       ,[int_Dr]           
       ,[int_Dist]         
       ,[bit_IsPre]        
       ,[sint_SceCarrID]   
       ,[bit_IsLoc]        
       ,[int_ClTypeID]     
       ,[tint_PMb]
 from ##tblCRD
 
select   @FinishTime  = Getdate()
       ,@Duration = DATEDIFF(ss, @StartTime, @FinishTime)
 
select   @FinishTime 
Select   @Duration
 
ROLLBACK TRANSACTION

NOTE: This is not a complete list of transactions used for tests. These are an indication only.

The results.

The following table shows the results comparing exactly the same queries:

 

Original Table

Partitioned View

Comment

Insert

13 Seconds

18 Seconds

 

Update *

8 Seconds/7 Sec

4 Seconds/63 Seconds

With and Without Where clause

Delete **

93 Seconds/92 Sec

4 Seconds / 18 minutes

With and Without Where clause

Select ***

15 Seconds/ 5 Sec

2 Seconds/3 min 15 Sec

With and Without Hint

Note: The Update query (following) was run with and without the following where clause. This where clause actually changes the plan chosen by the query optimiser by forcing it to figure out which member tables are needed before running the update.

Update tblCRD
Set tblCRD.int_Dr = tblCRD. int_Dr * -1
From tblCRD inner join ##tblCRD
On tblCRD.int_detID = ##tblCRD.int_detID
Where CRD.int_detID >= (Select MIN(T1.int_detID) from ##tblCRD T1)
AND CRD.int_detID <= (Select MAX(T2.int_detID) from ##tblCRD T2)

** Note: The Delete query used here was using a trick that favours the Partitioned View. If the Delete statement was:

DELETE [dbo].[tblCRD]
FROM [dbo].[tblCRD] CRD JOIN ##tblCRD T
ON CRD.int_detID = T.int_detID

The time for the Partitioned View blows out to 18 minutes! The time for the original table is 92 seconds.

*** Note: Firstly I wanted to prevent re-use of query plans so the queries were varied. Compare only between Original table and Partitioned View.

The 15 second result for Table select versus 2 second result for the view was done using a nolock and index hint on the table select statement and just a nolock hint on the view select.

The other result shows the speed of a different select statement that did not use any hints. Notice that the partitioned view did not perform very well compared to the table.

Conclusion.

Although partitioned views may appear to provide a performance boost, the use of partitioned views should be approached carefully. It is not something that should be implemented without thorough investigation and testing.

The downside to partitioned views are numerous. These include:

Can’t use Identity fields.

Can’t use index hints.

Probably need to rewrite existing queries to re-optimise.

To update partitioned views you need to have SQL Server 2000 Enterprise Edition

(or developer edition.)

All criteria to set up a partitioned view described in BOL must be followed implicitly.

There is a gotcha with Check Constraints on partitioning column: These must be added with Check Option. If disabled/re-enabled then view cannot be updated (refer to KB article Q270013 )

Can’t use text, ntext, image or timestamp fields.

Some queries will run much slower.

The upside to partitioned views can make it worthwhile. These include:

The queries that can restrict what member tables get used by the optimiser can run much faster.

The use of partitioned views allows data to be kept in separate databases that can be backed up separately. This is a lot simpler than using filegroups.

Keeping member tables in a few separate databases makes it easy to migrate to Federated Servers later. Distributed Partitioned Views allow massive scale out!

Archiving and purging old data can be made very easy. Backup database for archiving then truncate table with old data.

Data access to member tables can be balanced across different disk arrays just like filegroups.

Neil Jacobson

Nbj_jobs@hotmail.com

Sydney Australia.

Rate

5 (1)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (1)

You rated this post out of 5. Change rating