ColumnStore Index Maintenance routines

  • Hi all

    I use the Ola Hallengren Index Optimize script for rowstore index maintenance in my SQL estate. Recently we have been looking into an issue with our data warehouse where an overnight batch job is stalling. The system was developed by a third party - they have mentioned that there are ColumnStore indices that are 'fragmented' - i.e. percentage of deleted rows is more than 20% (hope this is correct). These are not being addressed by Ola's script as it doesn't cover ColumnStore indices. Does anyone have a CC optimization maintenance routine in place that they can share with me, or point me towards? I don't have much knowledge on CC indexes and from my earlier googles, maintaining them does not seem particularly clear with some articles suggesting they maintain themselves?

    Any tips will be appreciated!

    Thanks

    Doodles

  • You can rebuild and reorganize columnstore indexes the same as rowstore. The reasons you do this are different though. It's not about fragmentation, because of how columnstore indexes are defined, fragmentation just isn't the same kind of problem. Instead, it's about the delta store and compression. Inside a columnstore index, changes do not go directly to the index. Instead, they first go to what is called the delta store or delete bitmap. These are actually traditional rowstore b-tree indexes inside the columnstore. A delete from a columnstore is a logical delete, marked in the bitmap. An insert is just added to the delta. An update is a delete and an insert in the delta store. Over time, after about a million rows, you'll see a delta converted to a row group in the column store. However, in the mean time, lots and lots of extra reads are needed to deal with the deletes & data changes, slowing down the columnstore. So, a rebuild will perform actual deletes against the columnstore proper, clearing out the bitmap, and insert the rest of the data into properly pivoted and compressed columnstore storage. Reorganize will do most of the same work. You can read this article from Microsoft on the topic (yeah, there's still some controversy over this article, but I think the columnstore info in there is pretty good).

    Probably, in most situations, for columnstore, and columnstore only, you're probably best served by using reorganize rather than rebuild. However, with all things, testing is your best buddy. Follow the links in the article to examples of how to reorganize a columnstore index.

    ----------------------------------------------------
    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood...
    Theodore Roosevelt

    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • doodlingdba wrote:

    Hi all

    I use the Ola Hallengren Index Optimize script for rowstore index maintenance in my SQL estate. Recently we have been looking into an issue with our data warehouse where an overnight batch job is stalling. The system was developed by a third party - they have mentioned that there are ColumnStore indices that are 'fragmented' - i.e. percentage of deleted rows is more than 20% (hope this is correct). These are not being addressed by Ola's script as it doesn't cover ColumnStore indices. Does anyone have a CC optimization maintenance routine in place that they can share with me, or point me towards? I don't have much knowledge on CC indexes and from my earlier googles, maintaining them does not seem particularly clear with some articles suggesting they maintain themselves?

    Any tips will be appreciated!

    Thanks

    Doodles

    On a related note, if you're using REORGANIZE on your RowStore indexes, you're probably perpetuating page splits and using a whole lot more log file that you need to.  REORGANIZE doesn't work the way most people think it does.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply