ColumnStore Indexes not releasing the Deleted Rows

  • In on table which has CCI columnstore indexes on it, and I run

    select *,100*(ISNULL(deleted_rows,0))/NULLIF(total_rows,0) AS 'Fragmentation' from sys.dm_db_column_store_row_group_physical_stats

    I see trim_reason_desc which has DICTIONARY_SIZE.

    I found this

    https://feedback.azure.com/forums/908035-sql-server/suggestions/32897320-clustered-columnstore-index-reorg-doesn-t-remove-d

     

    Does anyone know how you get rid of the deleted rows in a CCI ?

     

  • That worked...I guess I keep an eye on it I had it do rebuild if > 30% fragmentation

  • It's not about fragmentation. It's about the delta store, the place where the deletes are kept. You want to watch that and rebuild based on clearing that up, not fragmentation.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Wow, I need to amend my CCI Index rebuild script

    ---First do this

    1.  ALTER INDEX idx_cci_someindex ON [sometable]  REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);

    SELECT * FROM sys.dm_db_column_store_row_group_physical_stats WHERE object_id  = object_id('CCI__someindex')

    ORDER BY row_group_id;

    Each group gets compressed

    i.e COMPRESSED 37500

    i.e COMPRESSED 37500

    i.e COMPRESSED 37500

    i.e COMPRESSED 37500

    2. Then you run it again and it does ALTER INDEX idx_cci_target ON cci_target REORGANIZE;

    COMPRESSED 300000  (total of above)

    3. If lots of deletes etc, then you get the 'DICTIONARY_SIZE' then you do a rebuild.

    select *,100*(ISNULL(deleted_rows,0))/NULLIF(total_rows,0) AS 'Fragmentation' from sys.dm_db_column_store_row_group_physical_stats where

    state_desc = 'Dictionary_Size'

    4. REBUILD a partion after loading data

    Removes the Partition with multiple deltastores

    ALTER INDEX [CCI__someindex] ON [dbo].[sometable] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = COLUMNSTORE)

  • Once you rebuild all partitions how do you find which partition to build, as I would think I don't want to rebuild all partitions every time as that would take a long time.

  • TRACEY-320982 wrote:

    Once you rebuild all partitions how do you find which partition to build, as I would think I don't want to rebuild all partitions every time as that would take a long time.

    sys.dm_db_index_physical_stats returns a partition column to identify the partition(s) of an index.

    I'll also state that the use of REORGANIZE should be avoided except when you must absolutely re-compress LOBs.  Otherwise, it can cause a great deal of collateral damage because it removes critical free-space from an index without adding it to the most critical part of an index, which are the pages that are fuller than than the Fill Factor if the Fill Factor isn't 0/100.  If it is 0/100, then it just removes free-space at the worst time possible.

    As for waiting for 30% to do a rebuild, why are you waiting for the largest amount of page splits and extra log file usage (which is actually a whole lot larger than what occurs for a REBUILD) to occur?

    Stop using the god awful "standard" of REORGANIZE between 10 and 30 and waiting to REBUILD at 30.  What should you do instead?  Start out by doing a deep study of the documentation for sys.dm_db_index_physical_stats instead of the "implied" standard of 10/30, which MS insisted Paul Randal to come up with.  It's intentionally super generic and fits nearly nothing practical in this day an age.  Even Paul Randal states something similar in his live presentations.

    Using a "one-size-fits-all" solution is killing performance and causing excess log usage.  It can even cause massive blocking on the morning after index maintenance as it has for me and several others.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks i am looking at sys.dm_db_index_physical_stats, a lot of fragmentation 100% with dictionary size.

    I will rebuild all these tables.

     

    For regular reorganize I will research a little more.

  • Don't reorganize ever. It's a waste of time. However, it's a particular waste of time on columnstore indexes. Rebuild or don't. That's it.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply