Clustered vs Nonclustered Columnstore Update Performance

  • Hi Everyone,

    I'm experiencing horrendous update performance with Clustered Columnstore Indexes (CCI). Whilst our current Nonclustered Columnstore Index (NCCI) setup is not going to win any update-performance awards, it works sufficiently well.

In my test setup, I can update 1 mil rows in about 90 seconds using NCCIs. Inserting 1 mil rows takes 30 seconds for both CCI and NCCI.

When I run those exact same updates on the same table, but indexed with a CCI plus a nonclustered unique index (on the column that would otherwise be the PK), the 1 mil row update takes >1 hour - I ended up just aborting the test.

As I'd read that a common workaround is to first delete the affected rows and re-insert them, I tried that. But the delete operation also took north of 1 hour.

As our SQL databases are populated by an external system's ETL process, we're essentially updating them every 30 minutes with the changes from the external system (based on created/updated timestamps). The 1 mil row updates therefore operate on individual rows (via the PK), which I understand to be the worst case for a columnstore.

Is there no way of writing directly to the CCI delete bitmap without touching the CCI row groups at all, given that the key used in the delete/update command also has a B-Tree index? I don't mind rebuilding our CCIs every night or so.

To be clear, NCCIs work great for us; I'm just a bit annoyed that we waste a lot of space on the rowstore, as our workload is (apart from the row updates) otherwise entirely OLAP in nature and barely hits the rowstore at all.

    I'd appreciate any advice you guys might have!

    Cheers

Have you tested without the nonclustered index? If it's not used, why have it at all? And yeah, lots of in-place updates are very hard on a columnstore index. Everything has to go through the delta store first, then it gets moved to the index second. You might try batching the inserts & updates at 100k per batch. Niko Neugebauer has some good information showing how this stuff works. In fact, I'd look through Niko's blog to see his suggestions on this topic. He's the man when it comes to columnstore indexes.
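A minimal sketch of what the batched delete-and-reinsert could look like; the table names (dbo.FactSales, dbo.FactSales_Staging) and the Id key column are made up for illustration, not from this thread:

```sql
-- Batched delete of changed rows from a columnstore table,
-- followed by a bulk re-insert of the new versions.
-- All object names here are hypothetical.
DECLARE @BatchSize int = 100000;

WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize) f
    FROM dbo.FactSales AS f
    WHERE EXISTS (SELECT 1
                  FROM dbo.FactSales_Staging AS s
                  WHERE s.Id = f.Id);

    -- Stop once the last (partial) batch has been processed.
    IF @@ROWCOUNT < @BatchSize BREAK;
END;

-- TABLOCK allows SQL Server to use the bulk-load path into the CCI,
-- so rows can go straight into compressed rowgroups.
INSERT INTO dbo.FactSales WITH (TABLOCK)
SELECT * FROM dbo.FactSales_Staging;
```

Tuning @BatchSize (e.g. trying 40k vs 100k) is worthwhile, since batch size can change the behavior considerably.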

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Yes, because of the way ColumnStore data is organized, it is great for read performance but poor for write performance. However, the most efficient way to mitigate something (in this case updates or deletes on a ColumnStore table) is to bypass it altogether.

For example, one thing I developed and maintain is a data mart consisting of a wide ColumnStore table. When a fact is updated, a column called EffectiveEndDate is populated with the current date, and a new replacement row is inserted into the fact table. Now, the trick is that EffectiveEndDate and a couple of other updatable columns are contained in a regular RowStore table, which acts like a Type 2 slowly changing dimension. So I'm never updating the ColumnStore table at all. A view joins the ColumnStore (fact) and RowStore (dimension) tables together, applying a WHERE clause on EffectiveEndDate to return only effective rows. Maybe this pattern would fit your use case too.

    https://www.kimballgroup.com/2008/09/slowly-changing-dimensions-part-2/
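A rough sketch of that view; EffectiveEndDate comes from the description above, while the table, view, and key names are invented for illustration, and I'm assuming a currently-effective row is marked with a NULL end date:

```sql
-- Fact rows live in a columnstore table and are never updated in place.
-- The updatable columns (EffectiveEndDate etc.) live in a small
-- rowstore "version" table. All object names are illustrative.
CREATE VIEW dbo.vwFactEffective
AS
SELECT f.*,
       v.EffectiveEndDate
FROM dbo.FactColumnstore AS f
JOIN dbo.FactVersion     AS v   -- rowstore, holds the updatable columns
  ON v.FactId = f.FactId
WHERE v.EffectiveEndDate IS NULL;  -- NULL = row is still effective
```

The point of the design is that every UPDATE only ever touches the small rowstore table, so the columnstore stays insert-only.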


    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

I initially figured the nonclustered index would help with the updates/deletes - it didn't. I also just ran a test without the nonclustered index, with only a minor improvement.

Yeah, Niko Neugebauer is great; I've previously read most of his articles on SQL columnstores. I just assumed that, as some of his articles are rather old, things might've changed in the meantime - the article Grant linked to is 7 years old, after all. The example queries he runs are also set up in a way that is well suited to columnstores: "DELETE TOP (1000000) dbo.FactOnlineSales" will always outperform "DELETE dbo.FactOnlineSales WHERE id IN (1,2,3,4,10,25)", and vastly more so on columnstores. Deleting 1 mil rows the first way runs in 6 seconds in my example case.

My main confusion is not so much the poor update performance as the poor delete performance when compared to NCCIs. My understanding is that the delta store should not actually have to be involved in delete operations. I don't see why SQL Server doesn't simply store a "deletion bitmask" of all deleted rows that is referenced each time a rowgroup is scanned. This would require some additional structure (like an internal ID to match rows during scans), but that sounds like a totally reasonable trade-off for potentially faster-than-rowstore deletes. Then again, I'm no DBA; I might be totally off base here.
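For what it's worth, SQL Server does keep per-rowgroup delete bookkeeping for compressed rowgroups (the expensive part is locating the rows to flag, not storing the flags). On SQL Server 2016 and later you can inspect it via a DMV; dbo.FactSales below is a placeholder table name:

```sql
-- Per rowgroup: state, total rows, and how many rows are flagged
-- as deleted (i.e. entries in the rowgroup's delete bitmap).
SELECT OBJECT_NAME(object_id) AS table_name,
       row_group_id,
       state_desc,      -- e.g. COMPRESSED, OPEN (delta store)
       total_rows,
       deleted_rows
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.FactSales');  -- placeholder name
```

Watching deleted_rows grow during a delete test can help confirm whether the time is spent locating rows rather than rewriting rowgroups.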

I've also tested smaller batches, with some encouraging but also confusing results. Deleting rows in batches of 100k takes about a minute per batch, while deleting in 40k batches takes only 12 seconds per batch, yielding a per-minute rate of roughly 200k rows. That's actually starting to approach acceptable performance for our use case. The reason for this is totally beyond me.

Regarding Eric's suggestion, that's definitely a nifty idea. In our case, I'd be a bit concerned about the amount of data growth we would produce this way, but that could possibly be manageable. The complexity of implementing this across the thousands of tables we currently need to maintain, however... that might make it a non-starter. Food for thought at any rate, thanks!

I'm curious to see if somebody might still have an idea, but thanks a lot for the suggestions so far!

    Cheers

  • Have you tried PAGE compression rather than going all the way to a columnstore format?

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • I've not had to work with ColumnStore yet but that sounds perfect, Eric.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Hi ScottPletcher, I have not seriously considered page compression instead of (N)CCIs, as dropping columnstores entirely would require us to create B-Tree indexes to keep query performance sane. In our use case, that's a no-go. Our analysts run queries against the SQL database via Power BI, so even they don't really know what SQL they're writing (Power BI generates the queries for them), let alone the guys who maintain the SQL database (mostly me). This is the exclusive purpose of the database; no other application accesses it.

Add to that that we have ~4000 tables to maintain: the only way we could do this is by having the entire life cycle (from inception to deletion) of our indexes managed automatically. I'd be happy to check out software that does this, but I'm sure as hell not writing it^^

  • MadOwl wrote:

    Add to that that we have ~4000 tables to maintain...

    Are they mostly ColumnStore or RowStore?


    --Jeff Moden



  • MadOwl wrote:

    ...

Regarding Eric's suggestion, that's definitely a nifty idea. In our case, I'd be a bit concerned about the amount of data growth we would produce this way, but that could possibly be manageable. The complexity of implementing this across the thousands of tables we currently need to maintain, however... that might make it a non-starter. Food for thought at any rate, thanks!

I'm curious to see if somebody might still have an idea, but thanks a lot for the suggestions so far!

    Cheers

In my case, because the ColumnStore table serves as something like a historical fact table in a DW, retaining versioned history of all inserted rows is a business requirement, so there are no deletes. However, because EffectiveEndDate and some indicator columns are updated following each night's incremental load, I moved them to RowStore. That versioned indicator table actually gets truncated and repopulated on occasion when business requirements change, like when new columns are added.

In your case, if bulk updating and deleting entire rows is a regularly occurring theme, then ColumnStore may not be the best choice. But I guess it's hard to refactor the database design when there are "thousands of tables".


  • MadOwl wrote:

    Hi ScottPletcher, I have not seriously considered page compression instead of (N)CCI's as not using any columnstores would require us creating B-Tree indexes to keep query performance sane. In our use case, that's a no-go.

    Hmm, not sure why creating non-clus indexes would be a "no-go"?!

    MadOwl wrote:

    Add to that that we have ~4000 tables to maintain, the only way we could do this is by having the entire life-cycle (from inception to deletion) of our indexes be managed automatically. I'd be happy to checkout a software that does this, but I'm sure as hell not writing it^^

I manage thousands of dbs (across a number of instances), each with dozens or hundreds of indexes.  Yeah, it takes time to get it under control but, once you do, it's not that bad.  Index analysis can't really be automated, but, yes, I've automated some of the actual creation of indexes.

    The key is to get the best clustering index that matches most queries / joins.  Then add non-clus indexes as needed.

Page compression can help tremendously by significantly reducing the I/O needed to get the data.  Writes are somewhat slower -- and large index rebuilds are noticeably slower -- but in general it's a good trade-off.  Only if you're already CPU bound should you rule out page compression.

    Row compression doesn't gain nearly as much, but there's no noticeable direct overhead from it.
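To illustrate, enabling page compression and estimating the savings up front could look like this; dbo.FactSales and IX_FactSales_Date are placeholder names, not from this thread:

```sql
-- Rebuild a heap/clustered table and one nonclustered index
-- with PAGE compression. Object names are illustrative.
ALTER TABLE dbo.FactSales
    REBUILD WITH (DATA_COMPRESSION = PAGE);

ALTER INDEX IX_FactSales_Date ON dbo.FactSales
    REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Estimate the space savings before committing to the rebuild:
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'FactSales',
     @index_id         = NULL,   -- NULL = all indexes
     @partition_number = NULL,   -- NULL = all partitions
     @data_compression = 'PAGE';
```

Running the estimation procedure first is the cheap way to find out which of your ~4000 tables would actually benefit.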


  • Jeff Moden wrote:

    MadOwl wrote:

    Add to that that we have ~4000 tables to maintain...

    Are they mostly ColumnStore or RowStore?

They are exclusively Nonclustered Columnstores.

    ScottPletcher wrote:

    MadOwl wrote:

    Hi ScottPletcher, I have not seriously considered page compression instead of (N)CCI's as not using any columnstores would require us creating B-Tree indexes to keep query performance sane. In our use case, that's a no-go.

    Hmm, not sure why creating non-clus indexes would be a "no-go"?!

I manage thousands of dbs (across a number of instances), each with dozens or hundreds of indexes.  Yeah, it takes time to get it under control but, once you do, it's not that bad.  Index analysis can't really be automated, but, yes, I've automated some of the actual creation of indexes.

Because the queries are ad hoc. Most queries are not reused more than ~100-200 times. Nobody actually writes SQL here; all SQL queries are machine generated. My previous benchmarks also indicated that even well-designed B-Tree indexes underperformed columnstores. Most queries return 80-100% of the entire table.

    Eric M Russell wrote:

    In your case, if bulk updating and deletion of entire rows is a regular occurring theme, then ColumnStore may not be best choice. But I guess it's hard to refactor the database design when there are - "thousands of tables".

As previously stated, NCCIs work just fine for us; the only issue is excessive data storage - NVMe SSDs are cheap, though, so it's not a deal breaker. It just seems inefficient.

    Cheers


  • Columnstores perform better because they compress data extremely well.

If you have a rowstore that's totally uncompressed, the difference will be massive.  If the rowstore is page compressed, at least it has a chance to perform reasonably well.

Yes, if you return 80-100% of the table every time, a columnstore would likely be better.  That makes it seem very likely that Power BI is reading (almost) all the rows, caching them locally, and then doing reporting / analysis from the local cache.  If that's the case, there might be better methods of doing that, but, yes, it could get complex.


  • ScottPletcher wrote:

    Yes, if you return 80-100% of the table every time, columnstore would likely be better.  That makes it seem very likely that Power BI is likely reading (almost) all the rows, then caching them locally and then do reporting / analysis from the local cache.  If that's the case, there might be better methods of doing that, but, yes, it could get complex.

That's exactly right. As Power BI reports are interactive and support ad-hoc filtering and aggregation based on user interactions, the best-practice approach within Power BI is to load all relevant data into the Power BI data model. The reason we're using Microsoft's SQL Server at all is that Power BI is most compatible with it; certain features don't work well (or at all) with other DBs or warehousing solutions. Getting a bit off topic here, though.

    Thanks everyone for the suggestions so far! I'll do some more testing and report back with the approach we chose to go for.

Power BI will cache, but if you're using DirectQuery it will also push the query down to SQL Server - and in that case a columnstore will shine if the query is doing aggregations (which it most likely is).

But even without DirectQuery - if you're doing an import and it's set up in a certain way, it will also load only aggregated data (up to a level), and again a columnstore will behave better than a rowstore table.

There is no single right answer, but if you're using a BI tool on top of the table, then a columnstore is likely the best option: it saves space, and it is either rarely touched (if the BI tool is caching data) or used in exactly the way a columnstore handles best (aggregations).


  • Hi Everyone,

    just wanted to give an update on the approach we ended up going with:

Further analysis of the write behavior of different tables showed that our largest and most resource-consuming tables (billions of rows) happened to also be the ones that never really required update operations. So we decided to migrate that class of table to CCIs and simultaneously disabled update operations for them entirely. The remaining tables rarely grow beyond ~30 million rows, and the inefficiencies introduced by NCCIs are acceptable for us there.

I also took this opportunity to benchmark B-Tree indexes for our workloads again and reaffirmed that columnstores perform better. The query optimizer doesn't actually even use them, and forcing the use of those indexes performed worse. I guess the way we use the database is closer to a data warehouse than to a traditional relational database.

    Thanks to everyone for their input!
