|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Wednesday, May 08, 2013 4:30 AM
Points: 20,
Visits: 279
|
|
Hi All,
Just implemented partitioning in SQL Server 2005. Which should be a great tool for better flexibility and performance. However I'm pretty disappointed running into performance problems while running simple queries like these:
select max(id) from tbl_test
If the table contains 50 miljons of rows. An index scan of 50 miljon rows occurs despite the existence of the index. When using an unpartitioned table an index seek is used as one would expect.
Then I found out this bug is also confirmed by Microsoft: http://social.msdn.microsoft.com/forums/en-US/sqldatabaseengine/thread/ac2f13d7-875a-4f0c-b119-c654e4990ea7/
I'm pretty disappointed in Microsoft especially the fact they are not able to solve this problem. I was wondering if some of you also ran into this problem and what workarounds you have used.
Best Regards, Peter (SQL Server DBA)
__________________ DBA freak since 2008
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Tuesday, March 26, 2013 8:41 AM
Points: 2,562,
Visits: 3,451
|
|
peterjonk (3/26/2010) select max(id) from tbl_test first of all this query never go for index seek as it doesnt contain any where clause and second, if this table has only one column (id) then above query should go for index scan and if it has more then one column it will go for table scan for these cases i am assuming that we have non clustered index on id column
-------Bhuvnesh---------- While 1 = 1 (Learning SQL....) Click to get fast response of your post
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Wednesday, May 08, 2013 4:30 AM
Points: 20,
Visits: 279
|
|
Please read the link I posted. A max query does in fact use an index seek if the column is a primary key or indexed. The query I provided was just to use a sample. In the real world the table has a clustered partitioned index. There is huge difference between the execution plans of a partitioned table (Index scan) and an unpartitioned table (Index seek).
__________________ DBA freak since 2008
|
|
|
|
|
SSCrazy Eights
        
Group: General Forum Members
Last Login: Saturday, May 04, 2013 11:13 AM
Points: 9,855,
Visits: 9,374
|
|
Have you tried a query like this?:
SELECT MAX(CAST(LTRIM(CAST(subID AS VARCHAR(MAX))) AS bigint)) AS id FROM ( SELECT $PARTITION.YourPatitionFunction(YourPartitionColumn) AS [partitionNum] , MAX(id) AS [subID] FROM tbl_test GROUP BY $PARTITION.YourPatitionFunction(YourPartitionColumn) ) AS a
I don't have partitioned tables to test this against, so it may have more opaqueness in the outer MAX than it needs.
-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung Proactive Performance Solutions, Inc. "Performance is our middle name."
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Wednesday, May 08, 2013 4:30 AM
Points: 20,
Visits: 279
|
|
Thanks for you reply and suggestion. It seems that rewriting the queries is the only way to work around this problem. Some of our .NET applications using ad-hoc queries (via OR mappers) or SP's to execute such queries receive time-outs. So we decided to rollback partitioning because we first need to investigate the impact of this problem. That's very time consuming if you have dozens of applications and stored procedures which need to be reviewed for query rewriting.
The problem is confirmed by Microsoft and still exists in SQL 2008
You've provided a very nice analysis that has shown several limitations in the optimizer related to queries on partitioned tables that use MIN and TOP. We're not in a position to fix these issues as a bug fix for SQL Server 2008 because they require some significant effort.
A very disappointing answer from Microsoft in my opinion. This is the workaround I used which includes an extra where clause to limit the number of rows which must be scanned:
declare @numrows int declare @return_value int
-- Get number of rows in table SELECT @numrows=sum(p.rows) FROM sys.partitions p inner join sys.indexes i on p.object_id=i.object_id and p.index_id=i.index_id WHERE p.object_id=OBJECT_ID('TBL_TEST') and name='PK_TBL_TEST_IDENTITY_ID'
-- Get max value of identity column select max(IDENTITY_ID) from TBL_TEST where IDENTITY_ID >= @numrows
__________________ DBA freak since 2008
|
|
|
|
|
SSCrazy Eights
        
Group: General Forum Members
Last Login: Saturday, May 04, 2013 11:13 AM
Points: 9,855,
Visits: 9,374
|
|
|
|
|
|
SSC Eights!
      
Group: General Forum Members
Last Login: Monday, June 04, 2012 5:00 PM
Points: 891,
Visits: 225
|
|
this has been improved in 2008. Seek happens at partition level and not at total table level.[Partition-Aware Seek Operation] [i think this is called as partition pruning]
|
|
|
|
|
SSChampion
        
Group: General Forum Members
Last Login: Wednesday, May 08, 2013 10:33 AM
Points: 10,989,
Visits: 10,529
|
|
vidya_pande (3/27/2010) this has been improved in 2008. Seek happens at partition level and not at total table level.[Partition-Aware Seek Operation] [i think this is called as partition pruning] Doesn't help at all. Read the Connect item linked to previously in this thread.
Paul White SQL Server MVP SQLblog.com @SQL_Kiwi
|
|
|
|
|
SSChampion
        
Group: General Forum Members
Last Login: Wednesday, May 08, 2013 10:33 AM
Points: 10,989,
Visits: 10,529
|
|
This is an annoying issue. I used to work around this by creating an extra non-partitioned index on the partitioned table.
The extra index had to be dropped before, and re-created after, partition SWITCHing, of course - but partitions typically aren't switched around very frequently, and usually during a maintenance window.
Slightly inconvenient, but worth it for the other benefits of partitioning.
Paul White SQL Server MVP SQLblog.com @SQL_Kiwi
|
|
|
|