How does a View work on a Table with a Clustered Columnstore Index ?

Introduction

I was asked a question on how a view works when it is created on a table that has clustered columnstore index. I, too, was unaware of how it works and was equally curious to know. This article describes what I have learned.

Explanation

There is a new pool in memory to handle columnstore operations, and it is known as the column store object pool. This is different from the buffer pool. Both the buffer pool and column store object pool reside in the memory allocated to the SQL Server.

A view stores the definition of a query and does not store any data. When we try to access data from a view, SQL Server executes the query in the view in order to provide the results. Let us see how a select is executed by SQL Server on a table that has clustered columnstore index on it.

The segments and dictionaries related to the partitions of a clustered columnstore index are pulled into the column store object pool first and then decompressed in the buffer pool before displaying the output. There could be many segments associated with a single object. So, when SQL Server displays the output, which has to be in the form of rows. The first segments along with their respective dictionaries of all the columns are pulled before proceeding to the second segment. The data is then decompressed in the buffer pool simultaneously to display the output. The pictorial representation is below.

When a select is executed on the view, Seg0 of all the columns (Col1, Col2, Col3, Col4) are pulled into the column store object pool and are then decompressed in the buffer pool before displaying the output.

Below are the observations I noticed practically. I created a table named ‘rtest’ with four columns in it. I inserted ‘test’ into all the columns and populated 4 million rows. Please note that all the fields are filled with the value ‘test’ in them.

create table rtest
(col1 varchar(100),
col2 varchar(100),
col3 varchar(100),
col4 varchar(100))

insert into rtest values('test','test','test','test')

insert into rtest select * from rtest

sp_spaceused rtest

I then created a clustered columnstore index on the table “rtest”. The space used after creating the columnstore index is shown below.

sp_spaceused rtest

I executed the below query to see the segment information for the object, “rtest”

select t1.hobt_id,t1.column_id, t1.segment_id, t1.on_disk_size
from sys.column_store_segments t1
inner join sys.partitions t2
on t1.partition_id = t2.partition_id
       where t2.object_id = OBJECT_ID('rtest')

For each column, 5 segments got created: 0,1,2,3,4. The query below gives the dictionary information related to the segments.

select 
  *
 from sys.column_store_dictionaries
 where hobt_id=72057594057392128

For each column, one dictionary got created with dictionary_id 0. Please note that the dictionary id numbering starts with 0 in SQL Server.

I wanted to clear all caches before creating the view using this code.

dbcc FREEPROCCACHE
GO
DBCC DROPCLEANBUFFERS
go
DBCC FREESYSTEMCACHE ('ALL')
GO
DBCC FREESESSIONCACHE
GO

The below query gave me the usage of the column store object pool. Initially 32 kb was allocated to the pool

select *
 from sys.dm_os_memory_cache_counters
 where type like '%columnstore%'

I used the below query to see what was stored in the cache (there should have been no rows returned since I cleared the cache earlier)

select name,type,in_use_count,entry_address,entry_data
from sys.dm_os_memory_cache_entries
where type = 'CACHESTORE_COLUMNSTOREOBJECTPOOL';

I created a view that does a ‘select’ on the table, rtest.

create view v_rtest 
as 
select 
 * 
 from rtest

I then ran a select on the view, v_rtest, to see the actions being performed at the backend by SQL Server.

select *
 from v_rtest;

The DMV sys.dm_tran_locks gives the information about the locks that are being held on database objects at that point of time. The locks shown below were held on the objects while select was running on the view.

select *
 from sys.dm_tran_locks

From the above output, we can see a lock was being held on rowgroup with resource_associated_entry_id same as hobt_id(72057594057392128) of the table “rtest”. We can check what is stored in the memory cache by using the DMV sys.dm_os_memory_cache_entries

select name,type,in_use_count,entry_address,entry_data
 from sys.dm_os_memory_cache_entries
 where type = 'CACHESTORE_COLUMNSTOREOBJECTPOOL';

In the cache entries, object_type indicates whether the entry is a segment (1) or a dictionary (2).

Check the cache entries from top to bottom. Initially, segment 0 and the related dictionary are pulled into the column store object pool and then the 1^st, 2^nd, 3^rd and so on. The information is pulled into the column store object pool, segment by segment and then decompressed in the buffer pool.

The column store object pool has risen only to 288 KB which was comprised of segment and dictionary related information.

select * from sys.dm_os_memory_cache_counters
where type like '%columnstore%'

Conclusion

SQL Server uses column store object pool for storing the column store related information before sending it to the buffer pool. It was good to know how SQL server handles memory internally for our operations and I wanted to share what I learnt. Hope my article was helpful.

How does a View work on a Table with a Clustered Columnstore Index ?

Introduction

Explanation

Conclusion

Rate

Share

Categories

Share

Rate