Why columnstore indexes are not very popular?

  • I noticed that many SQL Server shops do not even consider or discuss introducing columnstore indexes or in-memory columnstore indexes. While trying to solve performance issues with many queries in many other painful ways, this option is not even on the plate of DBAs and Developers and product owners agenda in many today's IT shops whose business in based on the backbone of MS SQL Server 2016.

    Does anybody know WHY?

    What would be the biggest objectives to using column store indexes? Highly transnational OLTP environments? Is it really a bar for trying columns store indexes? What else?

    Likes to play Chess

  • I don't have specific data, but I think it's the same reason that people are shy about using Extended Events.

    In 2008, Extended Events were introduced, but, weren't quite functional. In fact, they sucked. That was fixed in 2012, but it was already too late. Common knowledge was firmly established, don't use Extended Events because Trace is better.

    In 2012 when Columnstore indexes were introduced, they were read only. For some people, this wasn't even remotely a showstopper, but for others, there was no way they could take advantage of the index. So, common knowledge was immediately established that Columnstore indexes don't work well. Never mind that updateable columnstore indexes are now a thing. Never mind that clustered and nonclustered columnstore indexes means that you can define your data storage mechanisms to match the key aspects of your app, while still using traditional index types to offset behavior. Never mind enormous performance benefits. Common knowledge is established.

    I teach query tuning. I do it all over the world. Every time I demo columnstore and talk about it's use, people perk up, sit up in their seats. They just don't know about it.

    I think what happens is that people are busy. There's only so much time to learn stuff. So, something gets introduced. You evaluate it. It's good (Query Store), or it sucks (Extended Events, Columnstore), and then you don't reevaluate it ever again because you're busy.

    My god how much time do we spend fighting the concept that Page Life Expectancy isn't a very good measure at the best of times, but it can be less than 300 and your server is fine.

    I'm turning this into a blog post.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Because columnstore and non columnstore indexes are designed for different use cases.  And because of that column store indexes are not an automatic solution for performance issues and simply replacing all your normal indexes with column stores is likely to cause more problems depending on the application.

  • Memory and standard edition might be one reason. Likely ignorance is the biggest cause. People don't think about them, or as Grant mentioned, heard about early limitations.

  • its not for all things - I have a particular case on a DB that I am working at the moment.

    big table - 150 Million records -

    Most common sql on it is

    select ..40 columns .. into #temptable from tbl where reportdate = (select max(reportdate) from tbl)

    above select would return 2 million rows per report date

    with a clustered index it is quite fast - with a columnstoreindex it takes a few mins to run.

    note that on the same db we have most of the other tables as columnstores - its just this particular one where it goes slower - may be due to lower memory on server or so - but just to show that its not always better

    Now if we were doing aggregations on that table then it would most likely be faster as columnstore index are quite good for that.

    For other usages  , as usual, "IT DEPENDS" - can be better or can be worst

  • Oh yeah, I'm not complaining that people don't use it EVERYWHERE. I'm complaining that people don't use it, even in places where it would offer a clear advantage.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant,

    Thank you so much for your input. (I have been spot-reading your book, by the way, for years. It has been VERY helpful as well).

    Our company, after I shared about the columnstore indexes benefit and shared your opinion as well, got very interested. So a proof of concept is now needed, so that all DBAs, Architects, and Developers can observe an obvious advantage of using such indexes as opposed to not using them. In that respect, how would you suggest that I put together such demo/proof of concept?

    Should I use, for example, something like an AdventureWorks database, blow up one of the tables to 10 million rows, run an inefficient SP against it and then create a columnstore index and then run it again and see a mindblowing speed increase? Something like that? Or use something from our existing environment, like an obviously slow running process, create a shadow table for it, blow it up with more millions of rows (may be in Dev or SAT servers not Prod..) and create a columnstore index on it and demo the benefit of it to all involved data engineers?

    Also, the second part of the question: in case something like that could be necessary, would you personally be possibly available some time in January or February to participate for a couple of days as a consultant on such effort once we decide to implement it on a wider basis after the demo?

     

    Thank you.

    Likes to play Chess

  • Also, question regarding your book 'SQL Server 2017 Query Performance Tuning': If we use SQL 2016 Enterprise, not 2017, are our developers going to benefit from this book? I mean, do you refer in this book to many features that are new for 2017 and do not exist in 2016? Or this book will be as beneficial for working with 2016 as it is with 2016?

    THANK YOU!

    Voldemar

    Likes to play Chess

  • Aggregates and analysis queries. That's where it shines. A really simple GROUP BY on a larger data set will immediately show the benefits. Columnstore works well with aggregates and large sets, not point lookups. Point lookups are still best served by regular b-tree indexes. It doesn't take much, but it has to be the right kind of query.

    As to consuling, I can do a little. Scheduling is the hard part. I have to take vacation from work in order to do it. Therefore, I don't do much.

    Yes, the book very much applies to 2016. 90% or more of the book will be directly applicable to 2016.

     

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Is it a pre-requisite for implementing Columnstore indexes to have the table Memory Optimized?   (=in-memory OLTP?)

    Likes to play Chess

  • no need for Memory Optimized.

    example

    create table abc

    (id int)

    create clustered columnstore index CCI_abc on abc with (data_compression = columnstore)

  • Grant,

    what would be a short checklist

    of 1,2,3,... of things that DEFINITELY qualify a table for turning it into Memory Optimized and a list of COUNTERINDICATIONS against turning it into in-memory and using Columnstore indexes?

     

    Thank you!

    Likes to play Chess

  • I understand that using this below will be only a part of it.

    Making sure row is not longer than 8060 bytes is another..  what else?

    -------------------------------------------------------

    -- The queries below need to be executed per database.

    -- Also, please make sure that your workload has run for

    -- couple of days or its full cycle including ETL etc

    -- to capture the relevant operational stats

    -------------------------------------------------------

    -- picking the tables that qualify CCI

    -- Key logic is

    -- (a) Table does not have CCI

    -- (b) At least one partition has > 1 million rows and does not have unsupported types for CCI

    -- (c) Range queries account for > 50% of all operations

    -- (d) DML Update/Delete operations < 10% of all operations

    select table_id, table_name

    from (select quotename(object_schema_name(dmv_ops_stats.object_id)) + N'.' + quotename(object_name (dmv_ops_stats.object_id)) as table_name,

    dmv_ops_stats.object_id as table_id,

    SUM (leaf_delete_count + leaf_ghost_count + leaf_update_count) as total_DelUpd_count,

    SUM (leaf_delete_count + leaf_update_count + leaf_insert_count + leaf_ghost_count) as total_DML_count,

    SUM (range_scan_count + singleton_lookup_count) as total_query_count,

    SUM (range_scan_count) as range_scan_count

    from sys.dm_db_index_operational_stats (db_id(),

    null,

    null, null) as dmv_ops_stats

    where (index_id = 0 or index_id = 1)

    AND dmv_ops_stats.object_id in (select distinct object_id

    from sys.partitions p

    where data_compression <= 2 and (index_id = 0 or index_id = 1)

    AND rows >= 500000 -- 1048576

    AND object_id in (select distinct object_id

    from sys.partitions p, sysobjects o

    where o.type = 'u' and p.object_id = o.id))

    AND dmv_ops_stats.object_id not in ( select distinct object_id

    from sys.columns

    where user_type_id IN (34, 35, 241)

    OR ((user_type_id = 165 OR user_type_id = 167) and max_length = -1))

    group by dmv_ops_stats.object_id

    ) summary_table

    where ((total_DelUpd_count * 100.0/(total_DML_count + 1) < 10.0))

    AND ((range_scan_count * 100.0/(total_query_count + 1) > 50.0))

    Likes to play Chess

  • VoldemarG wrote:

    Grant,

    what would be a short checklist

    of 1,2,3,... of things that DEFINITELY qualify a table for turning it into Memory Optimized and a list of COUNTERINDICATIONS against turning it into in-memory and using Columnstore indexes?

    Thank you!

    But you don't have to set up memory optimized files to use this. Not in any way. It's just an index inside SQL Server. Yes, for some idiotic reason the marketing team decided that it needs to say MEMORY OPTIMIZED, but it has nothing to do with the in-memory functionality. This is NOT a reason to avoid the use of these indexes.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • VoldemarG wrote:

    Grant,

    what would be a short checklist

    of 1,2,3,... of things that DEFINITELY qualify a table for turning it into Memory Optimized and a list of COUNTERINDICATIONS against turning it into in-memory and using Columnstore indexes?

    Thank you!

    I think I misunderstood the question. First up, just call them Columnstore indexes. Don't buy into the marketing Memory optimized crap. Second, aggregates. Analysis. These are where they shine. If you're not seeing GROUP BY in the queries, I wouldn't suggest using them.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 15 posts - 1 through 15 (of 46 total)

You must be logged in to reply to this topic. Login to reply