Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12345»»»

Improving Performance of Cross-Partition Queries Expand / Collapse
Author
Message
Posted Monday, May 3, 2010 10:28 PM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Yesterday @ 6:36 PM
Points: 370, Visits: 1,970
Comments posted to this topic are about the item Improving Performance of Cross-Partition Queries




SQL# - http://www.SQLsharp.com/
Post #915063
Posted Tuesday, May 4, 2010 2:40 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Tuesday, November 6, 2012 3:58 AM
Points: 94, Visits: 457
Have you tried changing the order of the clustered index. You say you did try this index:

(IntDate ASC, CustomerID ASC, PartitionID ASC) ON[PartitionedTables](PartitionID)

But have you tried putting the PartitionID fist? :
(PartitionID ASC, IntDate ASC, CustomerID ASC) ON[PartitionedTables](PartitionID)

I'm very curious what happens when you try this, so please post the result....
Post #915141
Posted Tuesday, May 4, 2010 5:58 AM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 6:23 AM
Points: 9,928, Visits: 11,196
For SQL Server magazine subscribers, Itzik Ben-Gan has an article on exactly this issue at:
http://www.sqlmag.com/article/microsoft-products/max-and-min-aggregates-against-partitioned-tables.aspx
That was published March 16 2010.

Anyway, about this article. I'm afraid I did not rate it highly for many reasons. One shortcoming is the lack of easily consumable sample code either in-line with the text or as a download.

It would also have been good to mention that a non-partitioned index on a partitioned table does not suffer from this optimiser limitation:

-- Everyone has one of these, feel free to use another test database
USE tempdb;
GO
-- Example partition function and scheme
CREATE PARTITION FUNCTION PF(TINYINT) AS RANGE RIGHT FOR VALUES (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
CREATE PARTITION SCHEME PS AS PARTITION PF ALL TO ([PRIMARY]);
GO
-- Test table (partitioned heap)
CREATE TABLE dbo.ExampleTable
(
[ExampleTableID] [bigint] NOT NULL IDENTITY(1, 1),
[CustomerID] [int] NOT NULL,
[PartitionID] [tinyint] NOT NULL,
[IntDate] [int] NOT NULL,
)
ON PS (PartitionID);
GO
-- Partitioned clustered index
-- (note: clustering key should always be unique)
CREATE CLUSTERED INDEX [PCX dbo.ExampleTable IntDate, CustomerID]
ON dbo.ExampleTable
(IntDate, CustomerID)
ON PS (PartitionID);
GO
-- Add 100,000 randomish rows
INSERT dbo.ExampleTable WITH (TABLOCK)
(CustomerID, PartitionID, IntDate)
SELECT TOP (100000)
RAND(CHECKSUM(NEWID())) * 1000 + 1,
RAND(CHECKSUM(NEWID())) * 10 + 1,
CONVERT(INTEGER, CONVERT(CHAR(8), DATEADD(DAY, 0 - RAND(CHECKSUM(NEWID())) * 365 + 1, GETDATE()), 112))
FROM master.sys.columns C1,
master.sys.columns C2,
master.sys.columns C3;
GO
-- Add partitioned primary key
ALTER TABLE dbo.ExampleTable
ADD CONSTRAINT [PK dbo.ExampleTable ExampleTableID, CustomerID, PartitionID]
PRIMARY KEY NONCLUSTERED (ExampleTableID, CustomerID, PartitionID) ON PS (PartitionID);
GO
-- Add partitioned index on CustomerID
CREATE NONCLUSTERED INDEX [PIX dbo.ExampleTable CustomerID]
ON dbo.ExampleTable (CustomerID)
ON PS (PartitionID);
GO
-- Add non-partitioned index on IntDate
CREATE NONCLUSTERED INDEX [IX dbo.ExampleTable IntDate]
ON dbo.ExampleTable (IntDate)
ON [PRIMARY];
GO
-- +++++++++++++++++++++++++++++++++++++++++++++
-- Efficient plan based on non-partitioned index
-- +++++++++++++++++++++++++++++++++++++++++++++
SELECT minimum_date = MIN(IntDate)
FROM dbo.ExampleTable;

-- ++++++++++++++++++++++++++++++++++++++++++++++
-- Alternative method to create an efficient plan
-- based partly on Itzik Ben-Gan's approach
-- +++++++++++++++++++++++++++++++++++++++++++++
SELECT min_customer_id = MIN(iTVF.partition_minimum)
FROM sys.partitions AS P
CROSS
APPLY (
-- Minimum per partition
SELECT partition_minimum = MIN(ET.CustomerID)
FROM dbo.ExampleTable ET
WHERE $PARTITION.PF(ET.PartitionID) = P.partition_number
) iTVF
WHERE P.[object_id] = OBJECT_ID(N'dbo.ExampleTable', N'U')
AND P.index_id IN (0, 1);
GO
-- Tidy up
DROP TABLE dbo.ExampleTable;
DROP PARTITION SCHEME PS;
DROP PARTITION FUNCTION PF;

Of course it is fair to say that the non-aligned index would have to be dropped and re-created to perform SWITCH operations. In mitigation, I would just say that this likely occurs in a maintenance window anyway, and indexes are pretty quick to build. One other option is to maintain both partitioned and non-partitioned indexes - the optimiser is smart enough to choose the efficient plan based on the non-partitioned index for MIN and MAX queries.

Paul




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #915211
Posted Tuesday, May 4, 2010 7:49 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 2:37 PM
Points: 4,438, Visits: 6,343
Numerous SQL Server MVPs (and lots of other users as well) have been screaming at Microsoft for years to improve partitioning. I hope improvements will be coming in later versions of the engine. Connect has some entries for these things I think. It is a sin we need to jump through hoops for things like this!!

Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #915311
Posted Tuesday, May 4, 2010 9:08 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Monday, November 10, 2014 9:36 AM
Points: 1,414, Visits: 4,544
nice article

we have a 2TB database we're going to be moving to new hardware soon and i wanted to try to partition it because people seem to like to select an entire year's worth of data or more in one query. but there is no way to test it since we don't have 2TB of empty space anywhere


https://plus.google.com/100125998302068852885/posts?hl=en
http://twitter.com/alent1234
x-box live gamertag: i am null
[url=http://live.xbox.com/en-US/MyXbox/Profile[/url]
Post #915387
Posted Tuesday, May 4, 2010 9:43 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Yesterday @ 6:36 PM
Points: 370, Visits: 1,970
Sander A. (5/4/2010)
But have you tried putting the PartitionID fist? :
(PartitionID ASC, IntDate ASC, CustomerID ASC) ON[PartitionedTables](PartitionID)


Hello Sander. I did just try that with the same table and data that I used before and it performed basically the same as with the PartitionID at the end. Essentially 10 - 11 seconds for the generic/simple MIN(IntDate) query. The row order should be the same in either case since the values for PartitionID can only be of one value per partition.

Take care,
Solomon...





SQL# - http://www.SQLsharp.com/
Post #915420
Posted Tuesday, May 4, 2010 9:54 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Friday, September 5, 2014 2:00 PM
Points: 2,160, Visits: 2,191
What happens if you use a sub query to get the minimum value for each partition, and then get the min of that?

i.e. Something like this:

SELECT MIN(MinIntDate)
FROM (SELECT PartitionID, MIN(IntDate) MinIntDate
FROM dbo.ExampleTablePartitioned
GROUP BY PartitionID) a

or

SELECT MIN(MinIntDate) 
FROM (SELECT PartitionID FROM dbo.ExampleTable GROUP BY PartitionID) a
CROSS APPLY (SELECT MIN(IntDate) MinIntDate
FROM dbo.ExampleTable b
WHERE B.PartitionID = a.PartitionID) c

Post #915429
Posted Tuesday, May 4, 2010 9:56 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Tuesday, November 11, 2014 11:16 PM
Points: 49, Visits: 267
Hi alen,
Be careful before you decide to employ partitioning here. Do whatever you can do test your queries before you decide to implement. It could be a life-saver. :)

I agree w/ you SQL Guru...I've spoke w/ the partitioning team @ MS. The idea for partitioning is not that it makes queries faster (believe it or not), but that it eases moving data in and out and along with some maintenance advantages.

Tim
SQL Server MVP
Post #915432
Posted Tuesday, May 4, 2010 9:56 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Yesterday @ 6:36 PM
Points: 370, Visits: 1,970
Paul White NZ (5/4/2010)
It would also have been good to mention that a non-partitioned index on a partitioned table does not suffer from this optimiser limitation


Paul, thanks for the feedback. Interesting about the Ben-Gan article; I did not know about it.

And, for the sake of being thorough, you make a good point about the non-partitioned index. However, it is not that applicable in my situation. The index comes with two costs: disk space (in this case over 1 GB!) and it needs to be maintained across INSERT and DELETE operations. Given that the piece of code in my article is in about five Stored Procs, that is negligible disk space and still fairly maintainable. True, it does not help ad-hoc queries but we generally don't do those in Production, and if we do then rarely would they be across all partitions. I am sure others might find themselves in a similar situation.

Take care,
Solomon...





SQL# - http://www.SQLsharp.com/
Post #915433
Posted Tuesday, May 4, 2010 10:00 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Yesterday @ 6:36 PM
Points: 370, Visits: 1,970
alen teplitsky (5/4/2010)
nice article

we have a 2TB database we're going to be moving to new hardware soon ... but there is no way to test it since we don't have 2TB of empty space anywhere


Hello Alen and thanks. I am surprised that you cannot find 2 TB of available space given that I keep hearing how "disk is cheap" . Good luck!

Take care,
Solomon...





SQL# - http://www.SQLsharp.com/
Post #915436
« Prev Topic | Next Topic »

Add to briefcase 12345»»»

Permissions Expand / Collapse