SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Stairway To SQL Server Columnstore Indexes Level 2: Columnstore Storage


Stairway To SQL Server Columnstore Indexes Level 2: Columnstore Storage

Author
Message
Hugo Kornelis
Hugo Kornelis
SSChampion
SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)

Group: General Forum Members
Points: 11044 Visits: 11997
Comments posted to this topic are about the item Stairway To SQL Server Columnstore Indexes Level 2: Columnstore Storage


Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Iwas Bornready
Iwas Bornready
SSChampion
SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)SSChampion (13K reputation)

Group: General Forum Members
Points: 13878 Visits: 885
For someone like me who knows nothing about this, the article is incredible. Thank you so much.
mshparber
mshparber
SSC Rookie
SSC Rookie (31 reputation)SSC Rookie (31 reputation)SSC Rookie (31 reputation)SSC Rookie (31 reputation)SSC Rookie (31 reputation)SSC Rookie (31 reputation)SSC Rookie (31 reputation)SSC Rookie (31 reputation)

Group: General Forum Members
Points: 31 Visits: 14
Thanks for the detailed explanation!

I am still missing something...
It my user whant to see a crosstab report of NetPrice Amount by Month by Product - how the Columnstore index helps?
It still will have to match Date with Product with NetPrice at the ROW level - so it can aggregate that in March 2015 the revenues from Chairs were 20,000$

I probably missed something...
Thanks,

Michael
Hugo Kornelis
Hugo Kornelis
SSChampion
SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)

Group: General Forum Members
Points: 11044 Visits: 11997
mshparber (5/5/2015)
Thanks for the detailed explanation!

I am still missing something...
It my user whant to see a crosstab report of NetPrice Amount by Month by Product - how the Columnstore index helps?
It still will have to match Date with Product with NetPrice at the ROW level - so it can aggregate that in March 2015 the revenues from Chairs were 20,000$

I probably missed something...
Thanks,

Michael


Hi Michael,

If you have a columnstore index on a large table, then the optimizer will automatically create an execution plan that uses the columnstore index. If you are not getting the performance improvements you were hoping for, there may be several causes.

1. The columnstore index is not used. That is possible if the query is such that, according to the estimated costing done by the optimizer, it will actually take more time than using any of the other indexes. That can be true (there are actually queries that perform better with a traditional index), or it can be a mis-estimate.
2. The columnstore index is used, but the plan uses row mode execution instead of batch mode execution. That means that the query probably gains some performance from the columstore index, but far less than when batch mode would be used. This is especially common on SQL Server 2012, where batch mode was very limited and a lot of query constructions would cause SQL Server to fall back into row mode. I will cover this in detail in one of the later levels of the stairway.

If you can post a repro script (CREATE TABLE statements, a few INSERT statements with sample data, CREATE INDEX statements, and the actual query that does not perform as expected), I can take a quick look and see if I can give more details. Please test your repro script on an empty test database with case sensitive collation and US-English language settings to ensure that I can simply copy, paste and execute it on my end. Also please understand that I can take a quick look, but cannot afford to take many hours out of my consulting work to assist individual readers of this article - so if you are going to post a 10,000-line monster query, I can only respond by sending you my rates.


Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
t.franz
t.franz
Mr or Mrs. 500
Mr or Mrs. 500 (561 reputation)Mr or Mrs. 500 (561 reputation)Mr or Mrs. 500 (561 reputation)Mr or Mrs. 500 (561 reputation)Mr or Mrs. 500 (561 reputation)Mr or Mrs. 500 (561 reputation)Mr or Mrs. 500 (561 reputation)Mr or Mrs. 500 (561 reputation)

Group: General Forum Members
Points: 561 Visits: 203
any idea, if (and if yes then how) small datatypes as TINYINT (= 1 byte) could be compressed in columnstored indexes?

Its clear, that I will save space when I replace "this is a long string" with a simple tinyint (at least, when it occurse more than once), but when the column data type is already as short as possible, it could only work when you save "value 1 (which is the dictonary entry for the original tinyint CountryID 20) will used in rows 1-10, 12-50 and 90-200", but this would it make much harder to recombine rows to answer an SELECT *
Hugo Kornelis
Hugo Kornelis
SSChampion
SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)SSChampion (11K reputation)

Group: General Forum Members
Points: 11044 Visits: 11997
t.franz - Monday, April 24, 2017 8:21 AM
any idea, if (and if yes then how) small datatypes as TINYINT (= 1 byte) could be compressed in columnstored indexes?

Its clear, that I will save space when I replace "this is a long string" with a simple tinyint (at least, when it occurse more than once), but when the column data type is already as short as possible, it could only work when you save "value 1 (which is the dictonary entry for the original tinyint CountryID 20) will used in rows 1-10, 12-50 and 90-200", but this would it make much harder to recombine rows to answer an SELECT *

Yes, tinyint data is also compressed. The average space gain is less than for other data types, but it can still be relevant.

The primary compression method for tinyint will be run-length encoding (RLE). So if the data in rows 1 to 20 reads 1 / 1 / 1 / 1 / 2 / 2 / 2 / 2 / 3 / 3 / 1 / 1 / 1 / 1 / 4 / 1 / 3 / 3 / 3 / 3, then this will be stored as 1 (4) / 2 (4) / 3 (2) / 1 (4) / 4 (1) / 1 (1) / 3 (4). The algorithm that builds the columnstore index includes a step (after dividing the table into rowgroups) where the rows within a rowgroup are sorted to optimize the benefit of RLE. The "optimal" sort order for this column would after RLE result in 1 (9) / 2 (4) / 3 (6) / 4 (1). However, the sort algorithm that is used for columnstore indexes has to look at ALL columns, not just a single one.

A secondary compression method for tinyint tries to reduce the size of each value to less than 1 full byte. Let's for example say that, within a specific rowgroup, all values in your tinyint columns are NULL or between 23 and 48. The algorithm will replace NULLs with a value not in the normal range (probably 22), and then subtract 22 from each value changing the range to 0 (for NULL) and 1 - 26 (for 23-48); this can then be encoded in 5 bits instead of 8 for the value + 1 for the NULL marker. The "magic value" to represent NULL, and the offset of 22, are stored only once, in the rowgroup metadata. (You can find it in the DMV sys.column_store_segments. This DMV is introduced in level 4, though I have not shown all columns; the null value (null_value) and the offset (base_id) are two of the columns I chose not to include).



Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
t.franz
t.franz
Mr or Mrs. 500
Mr or Mrs. 500 (561 reputation)Mr or Mrs. 500 (561 reputation)Mr or Mrs. 500 (561 reputation)Mr or Mrs. 500 (561 reputation)Mr or Mrs. 500 (561 reputation)Mr or Mrs. 500 (561 reputation)Mr or Mrs. 500 (561 reputation)Mr or Mrs. 500 (561 reputation)

Group: General Forum Members
Points: 561 Visits: 203
Thank you very much for the quick answer. I always wondered, how exactly SQL Server does this stuff and your article (plus the answer) makes it a lot more clear for me.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search