SQLServerCentral Article

We Gave Memory-Optimized Tables a Hash Lookup — Then Tried Pattern Matching Instead


Introduction

It was the week before Black Friday — the biggest online ad rush of the year. Our US-based ad-tech platform was gearing up for an insane traffic spike. Hundreds of real-time campaigns were about to go live across multiple brands, each with thousands of user sessions flowing through our system. Every incoming user impression had to be tagged, routed, and matched to the correct campaign session — within milliseconds. Latency wasn’t just a metric; it was a contractual obligation tied to revenue. Our backend needed to keep up with hundreds of thousands of lookups per second, each validating session data from a live campaign table that was growing rapidly by the hour.

To meet this challenge, we turned to SQL Server's memory-optimized tables, powered by a hash index on campaign_id. The idea was simple: move the hottest table fully into memory, design for O(1) key-based lookups, and contain the latency. 

That’s when we noticed something surprising: in some scenarios, our disk-based tables with proper nonclustered indexes were performing better than the memory-optimized ones, especially once buffer pool caching kicked in.

That surprise sparked a new line of thinking: what if we experimented with queries that can't use indexes at all, such as campaign_id LIKE '%camp0%', which inherently bypass any index support? Poor query design of that kind was already recognized as technical debt, but the core question was whether such workloads might actually perform better on memory-optimized tables.

This article explores that very shift. We’re not here to claim memory-optimized tables are magic — but we did discover specific patterns where they still win, particularly under load or with large result sets. What started as a pursuit for speed turned into a deeper story about trade-offs, performance behavior, and how SQL Server’s architecture reacts when you stretch it in unexpected ways.

Simulation: Let’s Recreate What Went Wrong

Create a database that supports memory-optimized tables

This script first switches to the master database, then drops CampaignDB if it already exists to ensure a clean slate. It creates a new database called CampaignDB and then adds a special filegroup named MemOptFileGroup to support memory-optimized tables — which require a separate memory-optimized data filegroup. Finally, it adds a file (MemOptFile) to that filegroup and points it to a path on disk (C:\Data\MemOptFile); for a memory-optimized filegroup, this path acts as a container folder rather than a single data file. Without this setup, SQL Server won't allow you to create tables with MEMORY_OPTIMIZED = ON.

USE master;
GO
DROP DATABASE IF EXISTS CampaignDB;
GO
CREATE DATABASE CampaignDB;
GO
ALTER DATABASE CampaignDB 
ADD FILEGROUP MemOptFileGroup CONTAINS MEMORY_OPTIMIZED_DATA;
GO
ALTER DATABASE CampaignDB ADD FILE (
NAME = N'MemOptFile',
FILENAME = N'C:\Data\MemOptFile'
) TO FILEGROUP MemOptFileGroup;
GO

 

Here’s how we defined the table:

Create disk-based and memory-optimized tables for comparison

We first switch to our CampaignDB and drop any existing versions of the tables to ensure a clean setup. The first table, CampaignDisk, is a regular disk-based table. It has four columns: an auto-incrementing primary key (id), a campaign_id to identify the ad campaign, a session_id (stored as a UNIQUEIDENTIFIER), and a timestamp created_at which defaults to the current system time. After creating the table, we add a nonclustered index on campaign_id to support fast searches and filtering using this column — especially helpful for queries using = or LIKE 'prefix'.

Next, we create a memory-optimized version of the same table, called CampaignMem. It includes similar columns but is explicitly marked with MEMORY_OPTIMIZED = ON, meaning it will live in SQL Server’s in-memory OLTP engine. The primary key is defined as a nonclustered index, and we add a hash index on campaign_id using BUCKET_COUNT = 100000. Hash indexes are designed for O(1) key lookups — super fast if you use = filters, but useless for pattern matching. We also set DURABILITY = SCHEMA_AND_DATA, which ensures data is persisted across restarts — not just the structure. This setup lets us compare performance between disk and in-memory tables.

----Create Disk Based Tables----
USE CampaignDB;
GO
DROP TABLE IF EXISTS dbo.CampaignDisk;
GO
CREATE TABLE dbo.CampaignDisk (
    id INT IDENTITY PRIMARY KEY,
    campaign_id VARCHAR(50) NOT NULL,
    session_id UNIQUEIDENTIFIER,
    created_at DATETIME2 DEFAULT SYSDATETIME()
);
GO

-- Add index
CREATE NONCLUSTERED INDEX IX_CampaignDisk_CampaignId
ON dbo.CampaignDisk(campaign_id);

---Create Memory optimised tables---
DROP TABLE IF EXISTS dbo.CampaignMem;
GO
CREATE TABLE dbo.CampaignMem (
    id INT NOT NULL PRIMARY KEY NONCLUSTERED,
    campaign_id VARCHAR(50) NOT NULL,
    session_id UNIQUEIDENTIFIER NOT NULL,
    created_at DATETIME2 DEFAULT SYSDATETIME(),
    INDEX IX_CampaignMem_HashCampaignId HASH (campaign_id) 
        WITH (BUCKET_COUNT = 100000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

 

 

We created a memory-optimized table with a hash index on campaign_id to achieve lightning-fast lookups — aiming for O(1) performance. To minimize collisions and ensure even data distribution, we sized the bucket count to comfortably exceed the number of distinct campaign_id values we expected (BUCKET_COUNT = 100000 in the script above). The table was configured with DURABILITY = SCHEMA_AND_DATA, so both data and structure would persist across server restarts. Initially, the setup performed brilliantly — queries were blazing fast, and we were confident we'd nailed the performance challenge. But that success was short-lived. As query patterns evolved, especially with more flexible search requirements, things started to break. Here's how you can simulate the exact scenario we encountered.
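Before moving on, it is worth knowing how to sanity-check a bucket count choice. The sketch below (assuming the CampaignDB objects created above; it is most useful once the table has been populated in the next step) reads the hash index statistics DMV to see how evenly rows hash across the buckets.

-- Inspect hash index bucket usage for the memory-optimized table.
-- Long average/max chain lengths or very few empty buckets suggest the
-- bucket count is too low, or that the key values are heavily skewed.
USE CampaignDB;
GO
SELECT OBJECT_NAME(h.object_id) AS table_name,
       i.name                   AS index_name,
       h.total_bucket_count,
       h.empty_bucket_count,
       h.avg_chain_length,
       h.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats AS h
JOIN sys.indexes AS i
  ON i.object_id = h.object_id
 AND i.index_id  = h.index_id
WHERE h.object_id = OBJECT_ID('dbo.CampaignMem');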

Insert Identical Random Data in Both Tables

To test lookup performance between the disk-based and memory-optimized tables, we started by inserting the same set of randomized data into both. For this, we used the system table master..spt_values, which contains a large number of rows ideal for quick simulations. By cross-applying it with itself and limiting each side to 10 rows, we generated a dataset of approximately 10 × 10 = 100 rows.

Each row's campaign_id was generated as a pseudo-random 5-digit number prefixed with 'cmp_', ensuring diversity while following a consistent format. The CHECKSUM(NEWID()) % 100000 part ensured the IDs stayed within a 5-digit range, and we padded the result with leading zeroes for uniformity. The session_id was filled using NEWID() to mimic unique session tracking, just like in real-world ad campaign data.

For the memory-optimized table (CampaignMem), we explicitly assigned an id using ROW_NUMBER() since identity columns can be tricky in memory-optimized tables. This allowed us to mimic the same structure as the disk-based table while keeping inserts deterministic.

Once both tables were loaded, we ran a basic performance test. We declared two timers (@elapsedDiskMs and @elapsedMemMs) and executed 10 repeated lookups for a fixed campaign_id ('CAMP00095') in both tables. This gave us a realistic view of how each table performs under a read-heavy workload. The idea was to see whether our memory-optimized setup truly gave us an edge — or if SQL Server’s buffer pool and disk caching would surprise us.

 

-- Disk Table
INSERT INTO dbo.CampaignDisk (campaign_id, session_id)
SELECT 'cmp_' + RIGHT('00000' + CAST(ABS(CHECKSUM(NEWID())) % 100000 AS VARCHAR), 5),
       NEWID()
FROM master..spt_values a
CROSS APPLY master..spt_values b
WHERE a.type = 'P' AND b.type = 'P' AND a.number < 10 AND b.number < 10;

-- Memory Table
INSERT INTO dbo.CampaignMem (id, campaign_id, session_id)
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
       'cmp_' + RIGHT('00000' + CAST(ABS(CHECKSUM(NEWID())) % 100000 AS VARCHAR), 5),
       NEWID()
FROM master..spt_values a
CROSS APPLY master..spt_values b
WHERE a.type = 'P' AND b.type = 'P' AND a.number < 10 AND b.number < 10;

 

Lookup Performance Test

To measure and compare the lookup performance of the two table types, we wrapped each test block in a timer using SYSDATETIME() and calculated the elapsed time with DATEDIFF. We first declared two variables, @elapsedDiskMs and @elapsedMemMs, to store the time taken (in milliseconds) by each table type.

In the first block, we tested the disk-based table. We started the timer, then ran a loop 10 times to simulate repeated lookups for a specific campaign_id — in this case, 'CAMP00095'. Each iteration executed a simple SELECT 1 statement (to minimize result processing overhead) using a WHERE clause that filters by exact match. After the loop, we captured the end time and calculated the elapsed duration.

Next, we repeated the same logic for the memory-optimized table, CampaignMem. Again, we ran 10 iterations of the same lookup query, using a WHERE clause with the same campaign_id = 'CAMP00095'. This allowed us to directly compare performance under identical conditions — same data, same logic, same server.

Finally, we used PRINT statements to output the total time taken by each test block. This approach provides a clean, repeatable performance comparison between the two storage engines. It helps validate whether the hash-based memory-optimized table truly delivers faster lookups than a disk-based table — or whether caching and index seeks make the disk table surprisingly competitive.

 

-- Declare timers
DECLARE @startTime DATETIME2, @endTime DATETIME2, @elapsedDiskMs BIGINT, @elapsedMemMs BIGINT;

-- =======================
-- Disk-based table timing
-- =======================
SET @startTime = SYSDATETIME();

DECLARE @i INT = 0;
WHILE @i < 10
BEGIN
    SELECT 1 FROM dbo.CampaignDisk WHERE campaign_id = 'CAMP00095';
    SET @i += 1;
END;

SET @endTime = SYSDATETIME();
SET @elapsedDiskMs = DATEDIFF(MILLISECOND, @startTime, @endTime);

-- ==============================
-- Memory-optimized table timing
-- ==============================
SET @startTime = SYSDATETIME();

SET @i = 0;
WHILE @i < 10
BEGIN
    SELECT 1 FROM dbo.CampaignMem WHERE campaign_id= 'CAMP00095';
    SET @i += 1;
END;

SET @endTime = SYSDATETIME();
SET @elapsedMemMs = DATEDIFF(MILLISECOND, @startTime, @endTime);

-- ================
-- Final Results
-- ================
PRINT 'Total Time (Disk-Based Table): ' + CAST(@elapsedDiskMs AS VARCHAR) + ' ms';
PRINT 'Total Time (Memory-Optimized Table): ' + CAST(@elapsedMemMs AS VARCHAR) + ' ms';

And now observe…

The Shocker: Disk Table Was Faster for Direct Lookup?

We expected our memory-optimized table with a hash index to dominate the performance charts.

But here's what we saw in reality:

The screenshot shows a performance benchmark comparing 200 lookups for a fixed campaign_id = 'CAMP00095' across both the disk-based and memory-optimized tables. The results are unexpected: the disk-based table completed all lookups in 1906 ms, while the memory-optimized table took 3328 ms — nearly 75% slower. This contradicts the common assumption that memory-optimized tables are always faster. The reason lies in how each table was accessed. After the first lookup, the disk-based table's pages were loaded into the buffer pool, so every subsequent lookup was a cheap index seek against data already sitting in RAM. The memory-optimized table, meanwhile, was queried through regular interpreted T-SQL (the interop layer) rather than a natively compiled module, and that per-execution overhead outweighed the benefit of the hash lookup at this small scale. This test clearly shows that in-memory doesn't always mean better — especially when caching and indexes are used effectively on disk-based tables.
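As an aside, the documented way to strip out that interop overhead for exact-match lookups is a natively compiled stored procedure. We did not use one in this benchmark, but a minimal sketch against the table above (assuming SQL Server 2016 or later; the procedure name is purely illustrative) looks like this:

-- Natively compiled procedure: compiled to machine code and bypasses the interop layer.
-- Only exact-match (=) predicates can use the hash index; this does nothing for LIKE.
CREATE PROCEDURE dbo.usp_GetCampaignSession
    @campaign_id VARCHAR(50)
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    SELECT session_id, created_at
    FROM dbo.CampaignMem
    WHERE campaign_id = @campaign_id;
END;
GO

-- Example call:
EXEC dbo.usp_GetCampaignSession @campaign_id = 'CAMP00095';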

Why Did the Disk Table Win?

Reason 1: Buffer Pool Caching is Incredibly Powerful

One of the key reasons the disk-based table outperformed the memory-optimized one was SQL Server’s buffer pool. After the first access, SQL Server loads the required data pages into memory. This means any repeat queries—even full table scans—can execute extremely fast because the data is already cached in RAM. In our case, the lookup was always for the same value, campaign_id = 'CAMP00095', so the necessary pages remained hot in memory after the first iteration. As a result, the subsequent lookups flew by in microseconds. However, it’s important to note that this caching advantage is not guaranteed forever. The buffer pool is volatile — significant data changes, memory pressure, or concurrent workloads can evict these cached pages, forcing the system to hit the disk again and degrading performance.
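If you want to see the caching effect for yourself, you can count how many of the table's pages are currently resident in the buffer pool. A rough sketch (assuming the CampaignDB objects created earlier; requires VIEW SERVER STATE permission):

USE CampaignDB;
GO
-- Count CampaignDisk pages currently cached in the buffer pool.
SELECT OBJECT_NAME(p.object_id) AS table_name,
       COUNT(*)                 AS cached_pages,
       COUNT(*) * 8 / 1024.0    AS cached_mb
FROM sys.dm_os_buffer_descriptors AS bd
JOIN sys.allocation_units AS au
  ON bd.allocation_unit_id = au.allocation_unit_id
JOIN sys.partitions AS p
  ON au.container_id = p.hobt_id
 AND au.type IN (1, 3)   -- IN_ROW_DATA, ROW_OVERFLOW_DATA
WHERE bd.database_id = DB_ID()
  AND p.object_id = OBJECT_ID('dbo.CampaignDisk')
GROUP BY p.object_id;

Run it right after the lookup loop and the table's pages should show up as cached; under memory pressure they may be evicted again.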

Reason 2: Index Seek on Disk = Sweet Spot

Another factor in the disk table’s favor was its nonclustered index on campaign_id. Even with a query like LIKE 'CAMP%', SQL Server can still partially use this index for range scans, resulting in efficient access paths. In contrast, memory-optimized tables rely on hash indexes, which only work with exact = matches. Any usage of LIKE, even with a simple prefix, causes a full table scan in memory, consuming more CPU and losing the O(1) performance benefit.
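You can verify which access path the optimizer chooses with a quick experiment. A small sketch (assuming the data loaded earlier, which uses the 'cmp_' prefix; compare the actual execution plans, or the logical reads reported by STATISTICS IO, to see the seek versus the scan):

SET STATISTICS IO ON;

-- Prefix pattern: SARGable, can be satisfied by a seek/range scan on IX_CampaignDisk_CampaignId.
SELECT COUNT(*) FROM dbo.CampaignDisk WHERE campaign_id LIKE 'cmp%';

-- Leading wildcard: not SARGable, forces a scan with a residual predicate.
SELECT COUNT(*) FROM dbo.CampaignDisk WHERE campaign_id LIKE '%0009%';

SET STATISTICS IO OFF;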

Ultimately, we observed that when queries use indexed filters and return small result sets, disk-based tables with caching and indexing often outperform memory-optimized ones.

Now comes the twist — what happens when the business starts asking for flexible pattern matches instead of exact lookups?

 

Now Comes the Twist: Pattern Matching

In our workloads, we came across many queries, from developers as well as the business, that required flexible wildcard pattern matching like the one below:

SELECT * FROM dbo.CampaignDisk WHERE campaign_id LIKE '%CAMP0%' OR campaign_id LIKE '%AMP0%';

 

So we decided to test this kind of workload on the memory-optimized table as well. A leading wildcard is no longer a prefix match, so it cannot be supported by an index seek or a hash lookup.

Indexes Don’t Help You Now

When queries involve patterns like LIKE '%xyz%' or more complex regex-style matching, neither the disk-based nonclustered index nor the memory-optimized hash index can help. These patterns are not SARGable, meaning SQL Server can’t use the index to narrow down the search. As a result, both types of tables fall back to full table scans. But in this scenario, the advantage shifts — memory-optimized tables begin to shine. With all data already in RAM and accessed natively without latching or I/O, they can scan large volumes of rows more efficiently, especially when compared to disk tables that might still incur page reads or buffer pool churn.

 

Let's run the query below, with a pattern search in place of a straight lookup, and see how it behaves differently:

-- Declare timers
DECLARE @startTime DATETIME2, @endTime DATETIME2, @elapsedDiskMs BIGINT, @elapsedMemMs BIGINT;

-- =======================
-- Disk-based table timing
-- =======================
SET @startTime = SYSDATETIME();

DECLARE @i INT = 0;
WHILE @i < 10
BEGIN
    SELECT 1 FROM dbo.CampaignDisk WHERE campaign_id like '%CAMP0%';
    SET @i += 1;
END;

SET @endTime = SYSDATETIME();
SET @elapsedDiskMs = DATEDIFF(MILLISECOND, @startTime, @endTime);

-- ==============================
-- Memory-optimized table timing
-- ==============================
SET @startTime = SYSDATETIME();

SET @i = 0;
WHILE @i < 10
BEGIN
    SELECT 1 FROM dbo.CampaignMem WHERE campaign_id like '%CAMP0%';
    SET @i += 1;
END;

SET @endTime = SYSDATETIME();
SET @elapsedMemMs = DATEDIFF(MILLISECOND, @startTime, @endTime);

-- ================
-- Final Results
-- ================
PRINT 'Total Time (Disk-Based Table): ' + CAST(@elapsedDiskMs AS VARCHAR) + ' ms';
PRINT 'Total Time (Memory-Optimized Table): ' + CAST(@elapsedMemMs AS VARCHAR) + ' ms';

This script compares how long it takes to perform 10 repeated pattern-based searches in a loop, using LIKE '%CAMP0%' against both the disk-based and memory-optimized versions of the Campaign table. We kept it to just 10 iterations because a pattern search generally returns a much larger result set than an exact lookup. First, the script declares timing variables to measure and compare the duration of each test block. It then captures the system time using SYSDATETIME() just before entering a loop that runs 10 times. Inside the loop, it executes a SELECT 1 from the disk-based table (CampaignDisk) using a LIKE filter with a leading wildcard (%CAMP0%), which prevents the query engine from seeking on any index. After the loop completes, it records the end time and calculates how long the block took in milliseconds.

The same logic is then repeated for the memory-optimized table (CampaignMem). Again, 10 identical queries using LIKE '%CAMP0%' are executed, and the elapsed time is calculated.

Finally, both execution times are printed out. This test is meant to simulate real-world pattern search workloads and demonstrates how both table types behave when index support is effectively disabled. In such cases, memory-optimized tables often perform better because full table scans in memory are faster than on-disk I/O, especially when no indexes can be leveraged.

We got the results below after running the pattern-based searches against both the memory-optimized and disk-based tables:

The results in the screenshot clearly show that the memory-optimized table outperformed the disk-based one during a pattern-matching workload using LIKE '%CAMP0%'. While the disk table took 28,141 ms, the memory-optimized table finished the same 10-query block in 9,766 ms — roughly three times faster. That said, the performance boost may not seem revolutionary, especially when measured in milliseconds. But the real win here isn't just about speed — it's about architectural resilience, which brings other indirect benefits.

Memory-optimized tables come with a built-in advantage: they are completely latch- and lock-free. In high-concurrency systems, this means your scans won’t get blocked by other reads or writes. Disk-based tables, on the other hand, can experience page latches, locks, and tempdb spillovers during large scans or under pressure. So even if raw speed isn't drastically better, memory-optimized tables often provide more consistent and predictable behavior under load.

Of course, these tables aren't perfect. Hash indexes can suffer from bucket skew, especially if the distribution of your key column isn't uniform or the bucket count is undersized. Additionally, memory-optimized tables are subject to garbage collection delays, which can impact write-heavy workloads or lead to unpredictable pauses if not monitored.
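A couple of DMVs make that monitoring straightforward. A rough sketch (assuming CampaignDB from earlier; column lists trimmed to the essentials):

USE CampaignDB;
GO
-- Memory footprint of the memory-optimized table and its indexes.
SELECT OBJECT_NAME(object_id)   AS table_name,
       memory_used_by_table_kb,
       memory_used_by_indexes_kb
FROM sys.dm_db_xtp_table_memory_stats
WHERE object_id = OBJECT_ID('dbo.CampaignMem');

-- In-Memory OLTP garbage collection queue activity (instance-wide).
SELECT * FROM sys.dm_xtp_gc_queue_stats;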

Still, with proper sizing, indexing, and memory budgeting, these pain points can be managed. In workloads where pattern scans are frequent and concurrency is high, memory-optimized tables not only perform well — they also bring stability to the table. This makes them a solid choice when you need more than just raw speed.

Why Memory-Optimized Wins in Pattern Search

When it comes to pattern-based searches, especially those that bypass index usage like LIKE '%pattern%', memory-optimized tables start to show their real strength. Since all data resides entirely in RAM, there’s no disk I/O or page latching involved during scans. Unlike disk-based tables, memory-optimized ones don’t rely on the buffer pool — their data is always hot and instantly accessible. Even when queries return large result sets or run under concurrent load, memory-optimized tables tend to handle full scans more gracefully and consistently, making them a strong contender for workloads involving heavy pattern matching or flexible search conditions.

 

Takeaway: Use Case Determines the Winner

 

Scenario                                  | Winner
------------------------------------------|--------------------------------------
Exact match (=) with caching              | Disk-based (sometimes faster)
Pattern match (LIKE '%xyz%')              | Memory-optimized
High write throughput                     | Memory-optimized
Query uses index + low change frequency   | Disk-based (due to cache reuse)
Large result set scanning                 | Memory-optimized (parallelism helps)

Bonus Insight: Disk Cache Can Be a Trap

While disk-based tables can perform beautifully once the buffer pool is warmed up, this advantage is fragile. If the dataset grows large or the table experiences frequent updates, SQL Server may evict those cached pages to make room for new ones. When that happens, queries that were blazing fast suddenly start hitting disk again, and performance drops sharply. In such cases, memory-optimized tables — which operate entirely in RAM and don’t rely on the buffer pool — begin to look far more appealing for consistent performance under pressure.

 

Summary

We started with sub-millisecond lookups as our target and chose memory-optimized tables for that promise. Yes, we hit a wall when the query pattern changed. But repurposing that same memory-optimized table for pattern matching workloads actually gave us better throughput and stability under load.

That said, we’re not suggesting you blindly migrate all your regex or pattern search workloads to memory-optimized tables. These tables come with their own set of trade-offs — including sizing limitations, garbage collection behavior, and indexing constraints. Instead, consider this a blueprint: evaluate your specific workloads, observe how your queries perform with different access patterns, and carefully weigh the pros and cons.
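On the sizing point, one practical guard-rail is to bind the database to a Resource Governor pool so that In-Memory OLTP can never consume more than a fixed share of server memory. A hedged sketch (pool name and percentages are illustrative; the binding only takes effect after the database is cycled offline and online):

-- Cap how much server memory the memory-optimized data in CampaignDB may consume.
CREATE RESOURCE POOL CampaignMemPool
    WITH (MIN_MEMORY_PERCENT = 40, MAX_MEMORY_PERCENT = 40);
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO

EXEC sys.sp_xtp_bind_db_resource_pool
     @database_name = N'CampaignDB',
     @pool_name     = N'CampaignMemPool';

-- Cycle the database so the binding takes effect.
ALTER DATABASE CampaignDB SET OFFLINE WITH ROLLBACK IMMEDIATE;
ALTER DATABASE CampaignDB SET ONLINE;
GO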

It's crucial to clearly distinguish which memory-optimized tables should be durable (i.e., SCHEMA_AND_DATA) and which can remain non-durable. Ensure you have a solid backup, high availability (HA), and disaster recovery (DR) plan in place for memory-optimized tables. Always benchmark and profile their performance against the corresponding disk-based tables, using stress tests to guide your decisions.
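For reference, switching a table to non-durable is a one-option change. A minimal sketch of a hypothetical staging table (not part of our benchmark):

-- SCHEMA_ONLY: the table definition survives a restart, but the rows do not.
-- Suitable for staging or session-style data you can afford to lose.
CREATE TABLE dbo.CampaignStaging (
    id          INT            NOT NULL PRIMARY KEY NONCLUSTERED,
    campaign_id VARCHAR(50)    NOT NULL,
    payload     NVARCHAR(1000) NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);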

When planning such migrations, begin with low-criticality tables to minimize risk. Microsoft generally recommends targeting high-concurrency workloads, temporary or staging tables, IoT ingestion pipelines, and lookup/reference tables used in joins as ideal candidates for memory optimization.

Check how much memory you can realistically allocate without affecting other parts of your SQL Server instance. Start in testing or dev environments, simulate real-world usage patterns, and monitor performance under pressure. Memory-optimized tables are powerful — but only when used wisely and for the right workloads. What worked for us may not work out-of-the-box for everyone — but with careful benchmarking and planning, you might find surprising wins hiding behind those %LIKE% scans.

 
