• KTD (9/4/2013)


    However the new server creates massive IO and doesn't make use of some indexes that it should be using.

    The first thing to say is that your index on RecordDate does not, in fact, support ordering by RecordDate. This might seem counter-intuitive, but the index is partitioned by ClaimID (just like the base table is).

    The nonclustered index records are in RecordDate key order, but only within each partition. Effectively, the nonclustered index key is (PartitionID, RecordDate) because of the partitioning.

    This explains why you see an expensive sort on 194,656,980 rows even on SQL Server 2005. The partitioned nonclustered index cannot return rows in RecordDate order when more than one partition is involved, so SQL Server has to perform a full sort.

    The situation is logically the same in SQL Server 2005 and SQL Server 2008 R2, but the physical implementation is different. SQL Server 2005 used an APPLY implementation model; it processes one partition at a time in the query plan. The Constant Scan you see in the SQL Server 2005 plan is a list of partition IDs. The Index Scan operation on the inner side of the Nested Loops Join following the Constant Scan operates on a single partition at a time, with the current partition ID as the correlated join parameter.

    One problem with the 2005 implementation was that it did not always distribute work across parallel threads very well. You can see an example of this in the plan you submitted using the @random value. For that particular test run, all the qualifying rows happened to be in a single partition. If you look at the per-thread actual rows, you will see that one thread did all the work. The plan is effectively a serial plan, just with some extra overhead for the additional unused threads and the parallelism operators!

    The physical implementation for partitioned tables completely changed in SQL Server 2008. The APPLY model (Constant Scan + Nested Loops Join, per-partition processing) is no longer used. Instead, the effective index keys (PartitionID, RecordDate) are processed directly, exactly as if that were the real index definiton. You can find more details about the implementation and "enhancements" for SQL Server 2008 in the following TechNet article (please do read this):

    Query Processing Enhancements on Partitioned Tables and Indexes

    Unfortunately, dramatic changes to query processing always mean significant query plan changes. SQL Server 2008 (and all later versions) can no longer use the same query plans for partitioned tables that SQL Server 2005 did, because the internal implementation is so radically different.

    You will see all-new query plans for partitioned table queries when upgrading from SQL Server 2005. Many of these changes will result in improved performance, but some will not, and some will even be much worse. This means that real, detailed testing is needed before upgrading from 2005. As a result of that, you will end up rewriting some partitioned queries to produce good query plans.

    The particular case here is one example of a performance regression. Without an index hint, SQL Server seeks to the partitions and ClaimIDs using the clustered index and then sorts by RecordDate. The clustered index is much, much, wider than the nonclustered index, so you see a huge amount of I/O for this compared with the 2005 plan.

    When the nonclustered index is forced, an unfortunate side-effect results in two sorts being needed. As Erland mentioned, a general-purpose index optimization pre-sorts rows on ClaimID keys for the lookup into the clustered index. The set is then re-sorted on RecordDate later on.

    This is a performance regression, and undoubtedly unfortunate, but the Top operation cannot be performed earlier because the rows are not in RecordDate order before the explicit sort (they are in PartitionID, RecordDate order). The SQL Server 2005 plan just decided to sort on RecordDate alone, and take the expected random-I/O hit when performing lookups into the base table.

    One way to rewrite the test query for better performance is:

    WITH QualifyingRows AS

    (

    SELECT TOP (10000)

    C.ClaimID,

    C.RecordDate

    FROM dbo.Claims AS C

    WHERE

    C.ClaimID > 100781325

    ORDER BY

    C.RecordDate ASC

    )

    SELECT C.*

    FROM dbo.Claims AS C

    WHERE

    EXISTS

    (

    SELECT 1

    FROM QualifyingRows

    WHERE

    QualifyingRows.ClaimID = C.ClaimID

    AND QualifyingRows.RecordDate = C.RecordDate

    )

    ORDER BY

    C.RecordDate;

    The query plan produced uses a single Top-N Sort on RecordDate, while still making effective use a parallelism:

    You are very likely to encounter similar query plan issues with some types of aggregate query:

    http://sqlmag.com/t-sql/max-and-min-aggregates-against-partitioned-tables