
Columnstore Indexes – part 103 (“Partitioning 2016 vs Partitioning 2014”)


Continuation from the previous 102 parts, the whole series can be found at http://www.nikoport.com/columnstore/.

I have been wanting to write on this topic for quite some time, but all the exciting themes sitting on my to-do list and in the drafts of this blog kept me from it until today – when I finally decided to put down a couple of words and a rather simple test for Columnstore Indexes partitioning, especially since one of the next blog posts will be dedicated to this topic in even greater detail.

Partitioning – the unsung hero of Columnstore Indexes and one of the most essential elements for success when using them. Rebuilding a table with multiple billions of rows is a daunting task that even the bravest of DBAs will not (or SHOULD NOT) take lightly.

With so many incredible changes, improvements and additions in SQL Server 2016, was there something under the hood, hiding from the public eye? Something that nobody thought was worth mentioning? Something that can improve the performance of huge tables?

Anything?

Anything at all?

Let’s do a quick test between two equal instances of SQL Server 2014 and SQL Server 2016. I will use my own copy of the TPCH database (1GB version), which I generated with the help of HammerDB (free software).

Notice that, to do a proper comparison, the compatibility level is set to 120 even when working with SQL Server 2016:

/*
 * This script restores the backup of the TPC-H database from C:\Install
 */
USE [master]
GO
if exists(select * from sys.databases where name = 'tpch')
begin
    alter database [tpch]
        set SINGLE_USER WITH ROLLBACK IMMEDIATE;
end

RESTORE DATABASE [tpch]
    FROM DISK = N'C:\Install\tpch_1gb_new.bak'
    WITH FILE = 1, NOUNLOAD, STATS = 1;

alter database [tpch]
    set MULTI_USER;
GO
-- SET Compatibility level to 120, even when working with SQL Server 2016
ALTER DATABASE [tpch] SET COMPATIBILITY_LEVEL = 120
GO
USE [tpch]
GO
EXEC dbo.sp_changedbowner @loginame = N'sa', @map = false
GO
USE [master]
GO
ALTER DATABASE [tpch] MODIFY FILE ( NAME = N'tpch',  FILEGROWTH = 256152KB )
GO
ALTER DATABASE [tpch] MODIFY FILE ( NAME = N'tpch_log', SIZE = 1200152KB , FILEGROWTH = 256000KB )
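
Just to double-check that the compatibility level change really took effect (this is simply my own optional verification, not part of the original restore script), the current setting can be read from sys.databases:

-- Verify the current compatibility level of the restored database
SELECT name, compatibility_level
FROM sys.databases
WHERE name = N'tpch';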

To make the results more visible, we shall go a little bit “crazy” with partitioning, creating daily partitions for the 6 million row table, with just 2,375 rows per partition on average, like I did in Columnstore Indexes – part 94 (“Use Partitioning Wisely”):

DECLARE @bigString NVARCHAR(MAX) = '',
        @partFunction NVARCHAR(MAX);

-- Generate one boundary value for every day between 1 Jan 1992 and 31 Dec 1998
;WITH cte AS (
    SELECT CAST( '1 Jan 1992' AS DATE ) testDate
    UNION ALL
    SELECT DATEADD( day, 1, testDate )
        FROM cte
        WHERE testDate < '31 Dec 1998'
)
SELECT @bigString += ',' + QUOTENAME( CONVERT( VARCHAR, testDate, 106 ), '''' )
    FROM cte
    OPTION ( MAXRECURSION 5000 );

-- Build the CREATE PARTITION FUNCTION statement dynamically and execute it
SELECT @partFunction = 'CREATE PARTITION FUNCTION fn_DailyPartition (DATE) AS RANGE RIGHT FOR VALUES ( ' + CAST( STUFF( @bigString, 1, 1, '' ) AS NVARCHAR(MAX) ) + ' )';

EXEC sp_executesql @partFunction;

-- Place all partitions on the PRIMARY filegroup
CREATE PARTITION SCHEME ps_DailyPartScheme 
    AS PARTITION fn_DailyPartition 
    ALL TO ( [PRIMARY] );
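
If you want to verify that the partition function came out as expected (a small verification sketch of my own), the number of partitions and the boundary values can be inspected through the catalog views:

-- Number of partitions produced by the function (roughly one per day of the 1992-1998 range)
SELECT name, fanout
FROM sys.partition_functions
WHERE name = N'fn_DailyPartition';

-- A peek at the first boundary values
SELECT TOP 10 prv.boundary_id, prv.value
FROM sys.partition_range_values AS prv
    INNER JOIN sys.partition_functions AS pf
        ON prv.function_id = pf.function_id
WHERE pf.name = N'fn_DailyPartition'
ORDER BY prv.boundary_id;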

Now let’s create a copy of the original lineitem table and build a Clustered Columnstore Index on the new table lineitem_cci (notice that I am running the same script on both the 2014 and 2016 SQL Server versions):

-- Data Loading
SELECT [l_shipdate]
      ,[l_orderkey]
      ,[l_discount]
      ,[l_extendedprice]
      ,[l_suppkey]
      ,[l_quantity]
      ,[l_returnflag]
      ,[l_partkey]
      ,[l_linestatus]
      ,[l_tax]
      ,[l_commitdate]
      ,[l_receiptdate]
      ,[l_shipmode]
      ,[l_linenumber]
      ,[l_shipinstruct]
      ,[l_comment]
  into dbo.lineitem_cci
  FROM [dbo].[lineitem];
GO
 
-- Create Clustered Columnstore Index
create clustered columnstore index cci_lineitem_cci
on dbo.lineitem_cci;
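
At this point it might be worth taking a quick look at the Row Groups of the freshly built Clustered Columnstore Index – here is a simple check against sys.column_store_row_groups (again, my own optional verification, not required for the test):

-- Row Group states of the new Clustered Columnstore Index
SELECT state_description,
       COUNT(*) AS RowGroups,
       SUM(total_rows) AS TotalRows
FROM sys.column_store_row_groups
WHERE object_id = OBJECT_ID('dbo.lineitem_cci')
GROUP BY state_description;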

After that, we shall build a partitioned table lineitem_cci_part on both SQL Server 2014 and SQL Server 2016:

USE [tpch]
GO
CREATE TABLE [dbo].[lineitem_cci_part](
    [l_shipdate] [date] NULL,
    [l_orderkey] [bigint] NOT NULL,
    [l_discount] [money] NOT NULL,
    [l_extendedprice] [money] NOT NULL,
    [l_suppkey] [int] NOT NULL,
    [l_quantity] [bigint] NOT NULL,
    [l_returnflag] [char](1) NULL,
    [l_partkey] [bigint] NOT NULL,
    [l_linestatus] [char](1) NULL,
    [l_tax] [money] NOT NULL,
    [l_commitdate] [date] NULL,
    [l_receiptdate] [date] NULL,
    [l_shipmode] [char](10) NULL,
    [l_linenumber] [bigint] NOT NULL,
    [l_shipinstruct] [char](25) NULL,
    [l_comment] [varchar](44) NULL
)
GO
-- Create a Clustered Index on the partition scheme, so that the table gets partitioned
create clustered index cci_lineitem_cci_part
on dbo.lineitem_cci_part ( [l_shipdate] )
WITH (DATA_COMPRESSION = PAGE)
ON ps_DailyPartScheme( [l_shipdate] ); 
-- Replace it with a Clustered Columnstore Index, keeping the partitioning
create clustered columnstore index cci_lineitem_cci_part
on dbo.lineitem_cci_part
WITH (DROP_EXISTING = ON)
ON ps_DailyPartScheme( [l_shipdate] );
-- Load the Data
insert into dbo.lineitem_cci_part (l_shipdate, l_orderkey, l_discount, l_extendedprice, l_suppkey, l_quantity, l_returnflag, l_partkey, l_linestatus, l_tax, l_commitdate, l_receiptdate, l_shipmode, l_linenumber, l_shipinstruct, l_comment)
SELECT [l_shipdate]
      ,[l_orderkey]
      ,[l_discount]
      ,[l_extendedprice]
      ,[l_suppkey]
      ,[l_quantity]
      ,[l_returnflag]
      ,[l_partkey]
      ,[l_linestatus]
      ,[l_tax]
      ,[l_commitdate]
      ,[l_receiptdate]
      ,[l_shipmode]
      ,[l_linenumber]
      ,[l_shipinstruct]
      ,[l_comment]
  FROM [dbo].[lineitem_cci]
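
A quick sanity check that the partitioned copy really contains the very same number of rows as the source table (just an optional verification of mine):

-- Both counts should return the same value
SELECT (SELECT COUNT(*) FROM dbo.lineitem_cci)      AS SourceRows,
       (SELECT COUNT(*) FROM dbo.lineitem_cci_part) AS PartitionedRows;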

To make sure that we are dealing only with compressed Row Groups, let’s force the closure and compression of all Delta-Stores by executing the following statement:

alter index cci_lineitem_cci_part
on dbo.lineitem_cci_part
reorganize with (COMPRESS_ALL_ROW_GROUPS = ON);
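
To confirm that no open Delta-Stores are left behind, one can query sys.column_store_row_groups for anything that is not compressed – a small optional check of mine; an empty result means every Row Group is compressed:

-- Any Row Groups that are still not compressed?
SELECT state_description,
       COUNT(*) AS RowGroups
FROM sys.column_store_row_groups
WHERE object_id = OBJECT_ID('dbo.lineitem_cci_part')
      AND state_description <> 'COMPRESSED'
GROUP BY state_description;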

As the next step, I will use my open source CISL library to show the details of both partitioned tables, in order to make sure they are equal:

EXEC dbo.cstore_GetRowGroups @tableName = 'lineitem_cci_part';

On my instances they look equal, as presented on the picture above, with 0.12 GB of space occupied and 2,528 Row Groups.
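
If you do not have CISL at hand, a rough equivalent of that summary can be pulled directly from sys.column_store_row_groups (this is just a sketch of mine and will not match the CISL output exactly):

-- Overall size and Row Group count for the partitioned Columnstore table
SELECT COUNT(*) AS RowGroups,
       CAST( SUM(size_in_bytes) / 1024. / 1024 / 1024 AS DECIMAL(9,3) ) AS SizeInGB,
       SUM(total_rows) AS TotalRows
FROM sys.column_store_row_groups
WHERE object_id = OBJECT_ID('dbo.lineitem_cci_part');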

First, let’s run the test query that I have built against the base table with the Clustered Columnstore Index, the one that is not partitioned:

select TOP 10 l_discount, SUM(l_discount * 1.123) as TotalDiscount
from dbo.lineitem_cci
where l_shipdate <= '1998-01-01'
group by l_discount
order by TotalDiscount desc
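
For reference, the CPU and elapsed times quoted below were read from the STATISTICS TIME output; assuming the same session settings, the measurement can be reproduced roughly like this:

SET STATISTICS TIME ON;

-- the same test query as above; read the 'SQL Server Execution Times' messages afterwards
select TOP 10 l_discount, SUM(l_discount * 1.123) as TotalDiscount
from dbo.lineitem_cci
where l_shipdate <= '1998-01-01'
group by l_discount
order by TotalDiscount desc;

SET STATISTICS TIME OFF;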

I have observed some incredible results, and here they are:

CPU time = 640 ms,  elapsed time = 186 ms.
CPU time = 657 ms,  elapsed time = 175 ms.

One of the lines corresponds to SQL Server 2014, while the other corresponds to SQL Server 2016. The numbers on my VM were pretty much constant, with the small variations slightly in favour of SQL Server 2014 – I would say because of its simpler structure.

Well, 186 ms was the average result on SQL Server 2016 running under the 2014 compatibility level, while 175 ms of total elapsed time was the result for SQL Server 2014!

Now, let’s re-run the query, but this time against both SQL Server 2014 & SQL Server 2016 with their fully partitioned tables:

select TOP 10 l_discount, SUM(l_discount * 1.123) as TotalDiscount
from dbo.lineitem_cci_part
where l_shipdate <= '1998-01-01'
group by l_discount
order by TotalDiscount desc

What about now? Which result belongs to which SQL Server?

CPU time = 951 ms,   elapsed time = 266 ms.
CPU time = 1061 ms,  elapsed time = 353 ms.

266 ms was the result for the partitioned table under SQL Server 2016 (compatibility level 120), while 353 ms of total elapsed time was obtained on SQL Server 2014! This represents a solid 25% improvement.
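
As a side note, if you are wondering how many of the daily partitions the test predicate actually covers (and hence why partition elimination cannot save us much here), the $PARTITION function can give a quick answer – just an illustrative query of mine:

-- How many distinct daily partitions contain rows matching the test predicate?
SELECT COUNT(DISTINCT $PARTITION.fn_DailyPartition(l_shipdate)) AS TouchedPartitions
FROM dbo.lineitem_cci_part
WHERE l_shipdate <= '1998-01-01';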

All execution plans have the same iterators, but they differ in the overall estimated cost (the non-partitioned queries are estimated way lower than the partitioned ones), as well as in the distribution of the estimated costs within the execution plan; as for the rest, they are quite similar to the one shown in the image below:

One more thing – what about Compatibility Level 130 on SQL Server 2016?

To make sure that there is no mistake, I wanted to see how the queries perform when we enable the 2016 compatibility level (130):

USE [master]
GO
ALTER DATABASE [tpch] SET COMPATIBILITY_LEVEL = 130
GO

Running the very same queries obtains better results, as you can see in the images below:

One of the reasons we shall instantly notice is that the actual execution plan for the queries suffers some changes, including, for example, the Top N Sort iterator running in batch execution mode:

What you can take away from this point is that SQL Server 2016 really runs faster, especially if you are using its newest features.

Oh, but there is one more thing: let’s change the query a bit (it was originally written in a way that prevents some of the more advanced features of SQL Server 2016 from kicking in) and run it again. Moving the multiplication outside of the SUM means the aggregate now operates directly on the l_discount column, which allows it to be pushed down to the Columnstore scan:

set statistics time, io on;
set nocount on;

select TOP 10 l_discount, SUM(l_discount) * 1.123 as TotalDiscount
from dbo.lineitem_cci_part
where l_shipdate <= '1998-01-01'
group by l_discount
--having SUM(l_discount) > 2500.
order by TotalDiscount desc

select TOP 10 l_discount, SUM(l_discount) * 1.123 as TotalDiscount 
from dbo.lineitem_cci
where l_shipdate <= '1998-01-01'
group by l_discount
order by TotalDiscount desc

The actual execution plan is presented below:

and the total elapsed times are:
126 ms for the partitioned table;
16 ms for the non-partitioned table.

Yes, the difference is that big and that is a common thing if you take advantage of the Aggregate Predicate Pushdown in SQL Server 2016!

Just runs faster

There have been some smaller but important improvements in the way Columnstore Indexes partition access is processed in SQL Server 2016, and if you migrate from SQL Server 2014 to SQL Server 2016 and use partitioning heavily, you should be able to notice the positive difference. I can’t estimate if it will be 5%, 25% or 55%, but if you are working with partitioning (and you SHOULD) and you have a lot of partitions, then you might see some measurable improvements.

According to Sunil Agarwal, who was the Program Manager responsible for SQL Server 2016,

the switching to eliminated partitions during a scan has been removed; in SQL Server 2014 the switch to every single partition took place, regardless of whether it would be scanned or not.

This is one little but important improvement in SQL Server 2016 that, to my knowledge, has not been highlighted before, and I thought that showing it might help someone make the right decision about a SQL Server 2016 migration.

to be continued with Columnstore Indexes – part 104 (“Batch Mode Adaptive Joins”)
