index column order for a clustered primary key

  • Recently I noticed that some of the indexes supporting primary keys on our tables did not have the column order that was specified in the primary key definition for those tables. The result was that sometimes sub-optimal query plans were generated for queries against these tables. I started investigating and found that the index columns were ordered by their physical location in the row, not that of the key definition. I also found that this happens only on clustered primary keys: if I change the primary key from clustered into nonclustered, the index follows the column ordering in the primary key definition. When I change it back into a clustered key, the index follows the physical ordering of the columns again. The only way to make the index column order on a clustered primary key as I need it, is by re-arranging the physical column layout on the table.

    I initially found and tested this on Microsoft SQL Server 2008 R2 (SP1) - 10.50.2500.0 (X64), but the same behavior is shown in Microsoft SQL Server 2005 - 9.00.4266.00 (X64).

    I can nowhere find anything about this behavior in Books Online or MSDN. Does anyone have a link that makes this behavior official? i.e. so I can define an (internal) procedure that the physical column order must be properly specified to get the correct indexes?

    Consider the following test tables, note that the column order on the primary key differs from that on the tables:

    use tempdb

    go

    create table dbo.test (

    colA int not null,

    colB int not null,

    constraint pkTest primary key clustered (colB, colA)

    );

    go

    create table dbo.test2 (

    colA int not null,

    colB int not null,

    constraint pkTest2 primary key nonclustered (colB, colA)

    );

    go

    Below query lists the index columns for all indexes on these tables. Once ordered by their index_column_id and once by their key_ordinal. It demonstrates that a difference only occurs on clustered indexes.

    select object_schema_name(ix.object_id) + '.' + object_name(ix.object_id) as

    ,

    ix.name as [index],

    ix.type_desc,

    stuff((

    select ', ' + col_name(sc.object_id, sc.column_id) as [text()]

    from sys.index_columns sc

    where sc.object_id = ix.object_id

    and sc.index_id = ix.index_id

    order by sc.index_column_id

    for xml path('index-column'), type

    ).value('.','varchar(max)'), 1, 2, '') as [index columns],

    stuff((

    select ', ' + col_name(sc.object_id, sc.column_id) as [text()]

    from sys.index_columns sc

    where sc.object_id = ix.object_id

    and sc.index_id = ix.index_id

    order by sc.key_ordinal

    for xml path('index-column'), type

    ).value('.','varchar(max)'), 1, 2, '') as

    from sys.indexes ix

    where ix.object_id in (object_id('dbo.test'), object_id('dbo.test2'))

    order by 1, ix.index_id

    Both tables have physical column order colA, colB, but only if the primary key is nonclustered, the index has the column order as I defined it: colB, colA. For the clustered primary key the column order is now colA, colB.

    table index type_desc index columns key columns

    dbo.test pkTest CLUSTERED colA, colB colB, colA

    dbo.test2 NULL HEAP NULL NULL

    dbo.test2 pkTest2 NONCLUSTERED colB, colA colB, colA

    I know this does not seem hard to fix. Simply re-arranging the columns in the table fixes the problem, but that requires downtime for our 24x7 operation. Plus it is one more thing to be aware of and we didn't know about it. So my question is: is there any documentation that describes this or is it one more "little known feature" ?

    edit: tried to improve alignment in the table



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • There's actually nothing wrong here.... (though it certainly appears there is at first, second and 3rd glance)

    The index_column_id doesn't say that it gives the columns in the order that they appear in the index, it just says 'ID of the index column. index_column_id is unique only within index_id.' One could assume that's the key order, but apparently not in all cases. This could catch people badly. Worth a blog post I think, sometime.

    It's the key_ordinal column that shows what order the columns are within the index and that shows that you have ColB, ColA in both cases, which is exactly what you expect and what.

    From BoL: key_ordinal: Ordinal (1-based) within set of key-columns.

    If we dive down into the internals and have a look at the raw index page for table test (the one with the cluster that 'appears' to have the columns in the wrong order) you can see that they are in the order you intend in the index.

    Interesting find.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • p.s. I checked with queries too. A filter on ColB does an index seek and a filter solely on ColA does a scan, so ColB is the leading column of the index and ColA the second, as defined in the index definition.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • So I was wrong indeed (and I will quickly revoke the QoD based on this :Whistling:)

    I do think I'll put up a new one, just to illustrate how easily one can misunderstand this key_ordinal / index_column_id.

    Many thanks Gail.



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply