Composite clustered index ordering when inserting

  • Hello,

    How's a composite clustered index key is ordered?

    With a non-composite cluster index, the row are sorted using the clustered index key. Therefore it is often better adding at the end of the clusted index than in the middle of it.

    I'm asking because I'm wondering how's a composite clustered index is ordered to avoid inserting in the middle of it and force SQL to moves pages around.

    Using the following code, how's the ID will be ordered inside the clustered index?

    CREATE TABLE dbo.theTable

    (

    intValue1 int NOT NULL

    , intValue2 int NOT NULL

    )

    GO

    ALTER TABLE dbo.theTable ADD CONSTRAINT PK_theTable PRIMARY KEY CLUSTERED

    (

    intValue1

    , intValue2

    )

    INSERT INTO dbo.theTable

    SELECT 1,1

    UNION ALL

    SELECT 1,2

    UNION ALL

    SELECT 1,3

    UNION ALL

    SELECT 2,1

    UNION ALL

    SELECT 2,2

    UNION ALL

    SELECT 3,1

    After those insert, if I add:

    INSERT INTO dbo.theTable

    SELECT 1,4

    Will this cause performance issue with the clustered index?

    Thks

  • The insert will occur inside the clustered index, and if there is not enough space, a page split will occur.

  • Thank you,

    Therefore I presume by your answer that data inside the clustered index will be ordered like:

    Ordering ->

    1.1, 1.2, 1.3, 2.1, 2.2, 3.1

    Sorted by the first key, then by the second one and so on.

  • Yes, sorted in that order. The data is stored like that on pages.

    Note that page splits happen all the time, updates cause them when the new values don't fit on the pages or you insert inside the clustered index. This isn't necessarily a performance problem unless you are inserting or changing lots of data.

  • True,

    But I would like to avoid page split on a table with a possible 1,500 millions rows in it.

    If it needs to insert at the beginning of the clustered index and a split occurs, the app will have to wait forever on SQL to moves the pages around to make room for the new row.

    Disk space is an issue but far less than performance, therefore for that specific table I might forfeit the clustered and do a non-clustered index with the composite keys in it having all others columns added as include. (Keeping the table as a heap)

    This will take a lot of space but will avoid the page split. (No others columns will fit to be clustered)

  • You'll still possibly end up with page splits in the nonclustered index AND you'll be dealing with the fact that your structure isn't taking advantage of a good clustered index. In general, I'd stick with setting up the clustered index.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Yes but from my understanding, time to perform a page split from a non clustered index <= time spent for a page split from a clustered index.

    If that split occurs at the beginning of the data this would be even more noticeable (several times faster).

    Therefore I would be comfortable living with this kind of page split.

    And for the clustered index there's no other good candidates unless I go into a non-unique clustered index (which I want to avoid if possible).

    Still it's a good problem I'm on 🙂

    Any ideas are always welcome.

    Thks

  • Maybe worth a read - http://www.sqlservercentral.com/articles/Indexing/68563/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Megistal (8/10/2010)


    Yes but from my understanding, time to perform a page split from a non clustered index <= time spent for a page split from a clustered index.

    If that split occurs at the beginning of the data this would be even more noticeable (several times faster).

    Therefore I would be comfortable living with this kind of page split.

    A page split is a page split. Allocate new page, move some of the data onto the new page, set the index links up. It's not going to be faster towards the beginning of the index than in the middle and it's pretty much the same amount of work for a clustered or nonclustered index. After all, they don't look radically different if you're looking at the level of index pages.

    It's at the end (where new data has index value > anything existing) where the allocation of new pages is cheaper, because no rows have to be moved.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I think you're assuming that SQL Server would move all the pages in the clustered index if a split occurs. It wouldn't. It would only move the information on the page in question. It's possible that rebuilding an index (but not defragging) could be longer due to page splits at the top of the key, but not a single page split.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • GilaMonster (8/10/2010)


    Megistal (8/10/2010)


    Yes but from my understanding, time to perform a page split from a non clustered index <= time spent for a page split from a clustered index.

    If that split occurs at the beginning of the data this would be even more noticeable (several times faster).

    Therefore I would be comfortable living with this kind of page split.

    A page split is a page split. Allocate new page, move some of the data onto the new page, set the index links up. It's not going to be faster towards the beginning of the index than in the middle and it's pretty much the same amount of work for a clustered or nonclustered index. After all, they don't look radically different if you're looking at the level of index pages.

    It's at the end (where new data has index value > anything existing) where the allocation of new pages is cheaper, because no rows have to be moved.

    You must have just edged me by seconds on this one.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Whoops!

    Yes I was assuming this:

    I think you're assuming that SQL Server would move all the pages in the clustered index if a split occurs.

    Well I do have learn something new today about clustered index.

    Thks to both of you.

  • Grant Fritchey (8/10/2010)


    It's possible that rebuilding an index (but not defragging) could be longer due to page splits at the top of the key, but not a single page split.

    I would have thought the other way around. Rebuild is creating a new index then swapping it for the old one, reorganise is shuffling of pages into the correct order. Rebuild should be much the same work regardless of how much fragmentation and where it is, reorganise not so much.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (8/10/2010)


    Grant Fritchey (8/10/2010)


    It's possible that rebuilding an index (but not defragging) could be longer due to page splits at the top of the key, but not a single page split.

    I would have thought the other way around. Rebuild is creating a new index then swapping it for the old one, reorganise is shuffling of pages into the correct order. Rebuild should be much the same work regardless of how much fragmentation and where it is, reorganise not so much.

    Hmmm... But I thought that the reorganise was pretty light in it's touch, just eliminating blank pages & such and relinking the lists? I would have thought rebuilding the entire index would be more painful... May need to set up some tests on this one.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (8/10/2010)


    Hmmm... But I thought that the reorganise was pretty light in it's touch, just eliminating blank pages & such and relinking the lists? I would have thought rebuilding the entire index would be more painful... May need to set up some tests on this one.

    Agreed, but since the rebuild recreates the entire index regardless of what state it's in, it shouldn't be affected by how fragmented the index is, whereas the reorganise likely will be. Rebuild's certainly more intense, not disputing that.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply