With the introduction of Hekaton, In-Memory OLTP tables, in SQL Server 2014, there were several major changes in SQL Server’s internal processes. One such major change was in the Checkpoint processe for in In-Memory OLTP tables. The aim of this article is to compare the performance of checkpoint processes for In-Memory OLTP tables against traditional disk based tables and analyse the results
Checkpoint processes and Checkpoint File Pairs
Checkpoint processes for In-Memory OLTP tables are responsible for writing committed data from the transaction log files to data and delta files. In-Memory OLTP tables use Data and Delta file pairs (also known as Checkpoint File Pairs (CFP)) to store the data. They are
- Data Files – Files usually of size 128 mb, used to store the inserted data
- Delta Files - Files usually of size 16 mb, used to store pointers to Data files for the rows that were deleted.
Data and Delta files are always occur in pairs. In other words, there is always a Delta file for a Data File. Checkpoint processes for In-Memory OLTP tables write to smaller sized data and delta files, instead of larger data files as in disk based tables.
The subsequent sections will deal with the performance tests done to compare checkpoint processes on disk based tables and In-Memory OLTP tables.
Initial Setup – Database Creation
Two databases need to be created. One to store In-Memory OLTP tables (HKDB) and other (DiskDB) to store disk based tables. The scripts to create these databases are provided below. As we will be using a transaction log for our checkpoint performance analysis, the databases should be in FULL recovery model.
/********** Hekaton DB Creation ****************/ CREATE DATABASE HKDB ON PRIMARY(NAME = [HKDB_data], FILENAME = 'E:\MSSQL2014-UserData\HKDB_data.mdf', size=500MB), FILEGROUP [HKDB_mod_fg] CONTAINS MEMORY_OPTIMIZED_DATA (NAME = [HKDB_mod_dir_ckpt], FILENAME = 'E:\MSSQL2014-UserData\ckpt'), (NAME = [HKDB_mod_dir_delta], FILENAME = 'E:\MSSQL2014-UserLog\delta') LOG ON (name = [HKDB_log], Filename='E:\MSSQL2014-UserLog\HKDB_log.ldf', size=500MB); GO /**************** Ensure DB is on Full Recovery Model *******/ALTER DATABASE HKDB SET RECOVERY FULL WITH NO_WAIT GO /*********** Disk Database Creation ********/USE [master] GO /****** Object: Database [DiskDB] Script Date: 29/08/2015 6:25:23 PM ******/CREATE DATABASE [DiskDB] CONTAINMENT = NONE ON PRIMARY ( NAME = N'DiskDB', FILENAME = N'E:\MSSQL2014-UserData\DiskDB.mdf' , SIZE = 1024000KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024000KB ) LOG ON ( NAME = N'DiskDB_log', FILENAME = N'E:\MSSQL2014-UserLog\DiskDB_log.ldf' , SIZE = 512000KB , MAXSIZE = 2048GB , FILEGROWTH = 512000KB ) GO ALTER DATABASE [DiskDB] SET RECOVERY FULL WITH NO_WAIT GO
To ensure that the transaction log doesn’t get auto truncated, a full backup of the newly created databases needs to be taken. The script for this is provided below.
BACKUP DATABASE [DiskDB] TO DISK = N'E:\MSSQLBackup\DiskDB.bak' BACKUP DATABASE [HKDB] TO DISK = N'E:\MSSQLBackup\HKDB.bak'
Initial Setup - Table Creation
The Script below will create two tables, one In-Memory OLTP table and another disk based table. Both the tables are exactly identical in structure. The data column with the data type of uniqueidentifier is selected as the primary key (PK). In HK_tbl (the In-Memory OLTP table), the primary key constraint is enforced using a hash index. The Hash Bucket count is selected to be 40 million as we will be inserting little over 40 million rows.
use hkdb; GO CREATE TABLE dbo.HK_tbl ( [ID] Int identity(1,1) Not null , [Data] uniqueidentifier PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 40000000) NOT NULL, [dt] datetime not null ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA); GO use diskdb; GO CREATE TABLE dbo.Disk_tbl ( [ID] Int identity(1,1) Not null , [Data] uniqueidentifier PRIMARY KEY CLUSTERED, [dt] datetime not null ); GO
Performance Test Plan
The tests are to be performed on SQL Server 2014 SP1 and with a RAM of 16 GB. Here is the test plan
- Insert data in batches into HK_tbl (In-Memory OLTP table). The scripts would insert 40 million rows in 12 batches
- Delete random chunks of rows in 5 batches. Each batch deletes fixed number of rows ( 1000 rows ) at each iteration
- Issue Checkpoint
- Record the checkpoint timings using fn_dblog()
- Repeat the same above steps for Disk_tbl (disk table in diskdb)
Please note that to obtain correct results, tests have to be performed in sequence (Hekaton followed by Disk based table or other way around as well) and shouldn’t be performed in parallel.
The script provided below is used to load the data and trigger the checkpoint process. The script at first inserts 10,000 rows and then inserts 40 million rows in the next statement. Newid() function ensures the primary key values generated are in random order. Few delete statements to delete data at different sections of the table are also thrown in.
Print 'Start time: ' + convert(varchar,getdate(),113) use hkdb; GO Insert into HK_tbl(data,dt) Select NEWID(),getdate() GO 10000 Insert into HK_tbl(data,dt) Select newid(),getdate() from HK_tbl GO 12 delete HK_tbl where id between 1000 and 2000 delete HK_tbl where id between 25000 and 26000 delete HK_tbl where id between 555555 and 556555 delete HK_tbl where id between 6666666 and 6667666 delete HK_tbl where id between 333 and 1333 checkpoint; GO Print 'End time: ' + convert(varchar,getdate(),113)
A similar script is executed for the disk table as well.
Print 'Start time: ' + convert(varchar,getdate(),113) use DiskDB; GO Insert into [Disk_tbl](data,dt) Select NEWID(),getdate() GO 10000 Insert into [Disk_tbl](data,dt) Select newid(),getdate() from [Disk_tbl] GO 12 delete [Disk_tbl] where id between 1000 and 2000 delete [Disk_tbl] where id between 25000 and 26000 delete [Disk_tbl] where id between 555555 and 556555 delete [Disk_tbl] where id between 6666666 and 6667666 delete [Disk_tbl] where id between 333 and 1333 GO Checkpoint GO Print 'End time: ' + convert(varchar,getdate(),113)
The start and end times printed in the script are recorded on both the rounds, as they will be used to track checkpoint execution times. One point to note is that the intention of the test is not to test the performance of script execution, but the performance of the checkpoint processes.
Finding Checkpoint Duration
The undocumented function, fn_dblog, is used to read transaction logs is used to track checkpoint operation. fn_dblog function’s Operation column returns values “LOP_BEGIN_CKPT” , “LOP_END_CKPT”, which can be used to find the starting and finishing time of checkpoint processes. The script below will help us list the timings of all checkpoint processes that occurred during our testing.
SELECT f1.[Checkpoint Begin] , f1.[Current LSN] , f2.[Checkpoint End] , datediff(ms,f1.[Checkpoint Begin],f2.[Checkpoint End]) as duration FROM fn_dblog(NULL, NULL) f1 Inner Join fn_dblog(NULL, NULL) f2 On f1.[Current LSN] = f2.[Previous LSN] WHERE f2.Operation IN (N'LOP_BEGIN_CKPT', N'LOP_END_CKPT') and f1.[Checkpoint Begin] > convert(Datetime,'Start time' ) and f2.[Checkpoint End] < convert(Datetime, 'End time')
The start time and end time values are the one’s recorded during the performance tests. The script above needs to be run on both the databases. Date time filters are applied to capture the automatic checkpoints occurred during data loading and the manual checkpoint triggered in the script
Performance Data Comparison and Results
The results are tabulated below. As expected Hekaton tables outperform the disk based tables by a huge margin. Disk tables’ total checkpoint point time takes 54 seconds while In-Memory OLTP tables take less than 2 seconds. Hekaton tables checkpoint processes are almost 30 times faster.
Hekaton Table Results:
Duration in ms
Total time taken for checkpoint process
Disk table Results
Duration in ms
Total time taken for checkpoint process
In traditional disk based tables, the checkpoint process reads dirty pages and performs random I/Os on data files, as it has to write to defined positions in data files. However, for In-Memory OLTP tables, data files and delta files have only Streaming I/O, which implies that the writes are always append only and are written in chronological sequence. Each Data - Delta file pair have a begin timestamp and end timestamp, and the data file contains the rows that were inserted during that time period.
The following question may arise, inserts can be written in sequence but how are updates and deletes handled. Rows inserted in earlier time period (belonging to an old data and delta pair) can be deleted or updated. Updates are handled in the following way:
- New row with the updated values is inserted to the currently active / latest Data file
- Data - Delta file pair to which the old row belongs to is identified, and on the Delta file, a pointer to the old row in its corresponding data file is recorded. The pointer on the delta file indicates that the row in its corresponding data file has been deleted.
Delete operations are also handled in similar way. The above design ensures that a deletion / update of a row that was inserted at an earlier time period would still result in sequential I/O and not random I/O. This is the key reason for the performance improvements in the test.
Deletion of random rows and insertion of unordered data on clustered indexed column via Newid() function, during the performance tests , were done to compare the performance of checkpoint processes especially when there are random I/Os.
The tests show effectiveness of the new checkpoint processes on In-Memory OLTP tables. There are few other enhancements on checkpoint processes and many performance enhancements in general for In-Memory OLTP tables. It is always a challenge to demonstrate performance enhancements on internal processes; the above article is one such attempt to show the same.
As usual, I welcome our enthusiastic SQL Community to provide their valuable feedbacks and suggestions.