
When INCLUDE Columns Quietly Inflate Your Transaction Logs


Introduction

As DBAs, we often think of INCLUDE columns as a harmless way to cover queries. They aren’t part of the index key, so it’s easy to assume that adding them won’t have much impact.

This thinking is dangerously incomplete. While INCLUDE columns don’t participate in index order, they physically reside in index leaf rows. The silent impact of this design choice became crystal clear in a recent experiment I ran to test how INCLUDE columns affect transaction log generation during updates. Three scenarios matter here:

INCLUDE columns only: When you update columns that sit only in the INCLUDE portion of the index (not in the key), SQL Server doesn’t need to maintain the index structure – it simply updates the leaf-level data in place. This still generates log records, but not as many as key column updates do.

Key column updates: If you update columns that are part of the index key, the affected rows may have to move within the index. SQL Server logs a delete and an insert for each affected index row, including all of its INCLUDE columns, resulting in much higher log volume.

Wider rows: Adding more INCLUDE columns makes index rows wider. Whenever those index rows are logged – even for updates that don’t touch the INCLUDE columns themselves – slightly more data is written to the transaction log because the overall index row structure is larger.

While it’s understandable that adding INCLUDE columns increases index size and, consequently, full backup sizes, what’s not so obvious is how it also causes transaction log sizes to grow. This indirect behaviour is the real focus of our discussion.

Setup: Building a Realistic Orders Table

For the setup, I first created a database named TestLogImpact and set it to the FULL recovery model so that transaction log backups could be used for precise measurement. I then switched to this database and dropped any existing Orders table to start clean.

Next, I created a wide Orders table simulating a realistic ERP scenario, with columns such as CustomerID, OrderDate, ShipDate, Quantity, Price, Status, Comments, Region, CreatedBy, and UpdatedBy, plus five additional VARCHAR(500) columns named Extra1 to Extra5 to simulate payload-heavy tables.

Finally, I inserted 100,000 dummy rows by selecting from a CROSS JOIN of sys.all_objects to generate sufficient data volume. Each row was populated with a random CustomerID between 0 and 999, current dates for OrderDate and ShipDate, random Quantity and Price values, and repeated strings for Comments and all Extra columns to ensure wide rows. This dataset gave a practical yet performance-impactful baseline for testing how index INCLUDE columns affect transaction log generation during subsequent updates.

CREATE DATABASE TestLogImpact;
GO
ALTER DATABASE TestLogImpact SET RECOVERY FULL;
GO
USE TestLogImpact;
GO

DROP TABLE IF EXISTS Orders;
GO



CREATE TABLE Orders
(
    OrderID INT IDENTITY PRIMARY KEY,
    CustomerID INT,
    OrderDate DATE,
    ShipDate DATE,
    Quantity INT,
    Price DECIMAL(10,2),
    Status VARCHAR(20),
    Comments VARCHAR(500),
    Region VARCHAR(50),
    CreatedBy VARCHAR(50),
    UpdatedBy VARCHAR(50),
    Extra1 VARCHAR(500),
    Extra2 VARCHAR(500),
    Extra3 VARCHAR(500),
    Extra4 VARCHAR(500),
    Extra5 VARCHAR(500)
);
GO



INSERT INTO Orders (CustomerID, OrderDate, ShipDate, Quantity, Price, Status, Comments, Region, CreatedBy, UpdatedBy, Extra1, Extra2, Extra3, Extra4, Extra5)
SELECT TOP (100000)
    ABS(CHECKSUM(NEWID())) % 1000,
    GETDATE(),
    GETDATE(),
    ABS(CHECKSUM(NEWID())) % 100,
    CAST(ABS(CHECKSUM(NEWID())) % 5000 + 100 AS DECIMAL(10,2)),
    'Open',
    REPLICATE('Comment', 20),
    'North',
    'System',
    'System',
    REPLICATE('Extra1', 50),
    REPLICATE('Extra2', 50),
    REPLICATE('Extra3', 50),
    REPLICATE('Extra4', 50),
    REPLICATE('Extra5', 50)
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;
GO

Step 1: Taking a Full Backup

Since log backups require a base full backup, I took one before proceeding.

BACKUP DATABASE TestLogImpact TO DISK = 'C:\Temp\TestLogImpact_full.bak' WITH INIT;
GO

Step 2: Measuring Log Generation Without INCLUDE

I backed up the log to reset it before testing.

BACKUP LOG TestLogImpact TO DISK = 'C:\Temp\log_before_noinclude.trn' WITH INIT;
GO

Then I created a nonclustered index without any INCLUDE columns.

CREATE NONCLUSTERED INDEX IX_CustomerID
ON Orders (CustomerID);
GO

I ran an update modifying Quantity values for customers between 100 and 900.

UPDATE Orders
SET Quantity = Quantity + 1
WHERE CustomerID BETWEEN 100 AND 900;
GO

Finally, I backed up the log after the update.

BACKUP LOG TestLogImpact TO DISK = 'C:\Temp\log_after_noinclude.trn' WITH INIT;
GO

Step 3: Measuring Log Generation With INCLUDE

Next, I dropped the previous index and recreated it with INCLUDE columns covering almost all other wide columns.

DROP INDEX IX_CustomerID ON Orders;
GO

CREATE NONCLUSTERED INDEX IX_CustomerID_INCLUDE
ON Orders (CustomerID)
INCLUDE (OrderDate, ShipDate, Quantity, Price, Status, Comments, Region, CreatedBy, UpdatedBy, Extra1, Extra2, Extra3, Extra4, Extra5);
GO

I backed up the log to reset it again.

BACKUP LOG TestLogImpact TO DISK = 'C:\Temp\log_before_include_update.trn' WITH INIT;
GO

Then I ran the same update as before.

UPDATE Orders
SET Quantity = Quantity + 1
WHERE CustomerID BETWEEN 100 AND 900;
GO

Finally, I took a log backup after this update.

BACKUP LOG TestLogImpact TO DISK = 'C:\Temp\log_after_include.trn' WITH INIT;
GO

Results: The Surprising Truth

Before discussing the results, let’s summarise the backup sizes in the table below:

File Name                       Size (KB)    Approx Size (MB)
TestLogImpact_full.bak          204,936      ~200
log_before_noinclude.trn        128          ~0.12
log_after_noinclude.trn         10,500       ~10
log_before_include_update.trn   177,388      ~173
log_after_include.trn           299,780      ~293
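If you want to verify these numbers on your own instance, the backup history in msdb makes it easy to compare sizes without checking the files on disk. A minimal sketch, assuming the backups were taken on the same instance and history hasn’t been purged:

SELECT bs.backup_start_date,
       bmf.physical_device_name,
       bs.backup_size / 1024 AS size_kb,
       bs.type AS backup_type -- 'D' = full, 'L' = log
FROM msdb.dbo.backupset AS bs
JOIN msdb.dbo.backupmediafamily AS bmf
    ON bmf.media_set_id = bs.media_set_id
WHERE bs.database_name = 'TestLogImpact'
ORDER BY bs.backup_start_date;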


When I ran the update on the Orders table with just a simple nonclustered index without INCLUDE columns, the transaction log backup size was around 10 MB, as seen in log_after_noinclude.trn. That felt normal – the update touched many rows, but the log backup was still within the expected size for a table of this width and row count.

But when I ran the exact same update after creating a nonclustered index with INCLUDE columns covering almost every wide VARCHAR column in the table, the log backup size wasn’t just a bit bigger – it shot up to nearly 300 MB, as seen in log_after_include.trn. This means the same update generated around 30 times more transaction log volume just because the index contained those INCLUDE columns.

To be honest, I knew INCLUDE columns increase index size and can slightly increase log generation because of wider rows. But I did not expect a 10 MB update to become a 300 MB update purely due to INCLUDE usage. This isn’t about a few extra MBs you can ignore – it’s the difference between a smooth replication pipeline and one that lags badly because your log backups are hundreds of MB larger every hour.

It’s also worth noting log_before_include_update.trn. This log backup was taken right after I created the nonclustered index with the INCLUDE columns – no data update had even happened yet. Just creating the index itself generated around 173 MB of transaction log, purely because the index included so many wide columns. That alone shows how heavy INCLUDE columns can be, even before your first update runs against them.
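To see where that one-time cost comes from, compare the physical size of the two index variants. This is a quick sketch using sys.dm_db_partition_stats; the exact numbers will vary with how many rows your sys.all_objects CROSS JOIN produced:

SELECT i.name AS index_name,
       SUM(ps.used_page_count) * 8.0 / 1024 AS used_mb -- 8 KB pages
FROM sys.dm_db_partition_stats AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id
   AND i.index_id = ps.index_id
WHERE ps.object_id = OBJECT_ID('Orders')
GROUP BY i.name;

An index that INCLUDEs nearly every column is effectively a second copy of the table, and in the FULL recovery model its creation is fully logged – which is exactly what that 173 MB log backup reflects.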

These results show why we cannot treat INCLUDE columns as harmless. They physically reside in index leaf rows, and any update that touches indexed data writes wider log records when INCLUDE columns are present. In environments with frequent updates and DR replication, this silent log bloat can directly impact backup storage, restore times, network costs, and RTO compliance.


Log backup sizes, INCLUDE vs. no INCLUDE index: updates against the INCLUDE index generated much larger logs.

Why Does This Happen?

INCLUDE columns are physically stored in the index leaf rows. When you update a column covered by the index, SQL Server writes log records reflecting the before and after images of the affected parts of the index row. Wider INCLUDE columns translate directly into wider log records.
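If you want to see this at the level of individual log records, the undocumented fn_dblog function can break down log volume by allocation unit. The following is a rough sketch, not a supported tool – fn_dblog is undocumented, so use it only on a test system, and run it before the log is next backed up or truncated:

SELECT AllocUnitName,
       Operation,
       COUNT(*) AS record_count,
       SUM([Log Record Length]) AS total_bytes
FROM fn_dblog(NULL, NULL)
WHERE AllocUnitName LIKE 'dbo.Orders%'
GROUP BY AllocUnitName, Operation
ORDER BY total_bytes DESC;

Comparing the rows attributed to the nonclustered index before and after adding INCLUDE columns makes the extra per-record weight obvious.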

I also tested updating only INCLUDE columns versus key columns. Updating key columns causes row movement in the index, logged as a delete and insert of the entire index row, including INCLUDE columns. This generates the maximum possible log volume. Updating only an INCLUDE column doesn’t cause row movement, but it still logs the INCLUDE value change within the index row, generating more log than if the column wasn’t included at all. In short, key column updates are worse, but INCLUDE updates aren’t free.
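The exact statements from that comparison aren’t shown above, but a minimal sketch of the idea looks like this (the OrderID filter is arbitrary; wrap each statement in log backups, as in the earlier steps, to measure it):

-- Key column update: each affected row moves within IX_CustomerID_INCLUDE,
-- logged as a delete plus an insert of the full index row.
UPDATE Orders
SET CustomerID = CustomerID + 1
WHERE OrderID <= 1000;
GO

-- INCLUDE-only column update: no row movement, but the new value is
-- still logged inside the index leaf row.
UPDATE Orders
SET Comments = REPLICATE('Changed', 20)
WHERE OrderID <= 1000;
GO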

The Hidden Business Cost

When we think about creating or modifying indexes – especially when adding INCLUDE columns – our focus is usually on query performance. Will it speed up a SELECT? Will it reduce key lookups? But rarely do we pause to consider the ripple effects on the business side of operations.

In production environments that rely on log shipping or AlwaysOn availability groups, these changes can carry a hidden price tag. Every large update or index modification generates transaction log records. The larger the log records, the bigger the log backups. For log shipping, this means the secondary server now has to restore bigger log backup files, increasing replication lag. Your DR site that was just a few seconds behind could now be minutes behind, exposing the business to data loss risks during failovers.

Similarly, in AlwaysOn availability groups, larger transaction logs mean more data to synchronise between replicas. If your network bandwidth or replica performance cannot keep up, the secondary replicas begin to lag. For business-critical systems, this directly translates into longer failover times, breaching RTO/RPO SLAs silently.
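If you run availability groups, this pressure is directly visible in the send and redo queues. A monitoring sketch using the standard DMVs (sizes are reported in KB):

SELECT ar.replica_server_name,
       DB_NAME(drs.database_id) AS database_name,
       drs.log_send_queue_size, -- log not yet sent to the secondary (KB)
       drs.redo_queue_size      -- log not yet redone on the secondary (KB)
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = drs.replica_id;

Watching these queues before and after an index change is a cheap way to catch log bloat before it breaches your RPO.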

Beyond performance and availability, there’s a pure cost dimension. Backup storage isn’t free. When every log backup is inflated with unnecessary changes – often INCLUDE column updates that were never even queried – your backup size, storage bills, and even retention management complexity multiply quietly in the background.
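One way to catch this inflation before the storage bill does is to trend daily log backup volume from msdb. A sketch – substitute your own database name:

SELECT CONVERT(date, bs.backup_start_date) AS backup_date,
       SUM(bs.backup_size) / 1024 / 1024 AS total_log_mb
FROM msdb.dbo.backupset AS bs
WHERE bs.database_name = 'TestLogImpact'
  AND bs.type = 'L' -- log backups only
GROUP BY CONVERT(date, bs.backup_start_date)
ORDER BY backup_date;

A sudden, sustained jump that coincides with an index deployment is a strong hint that wide INCLUDE columns are at work.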

In essence, an index meant to optimise one slow query can end up silently choking your HA/DR pipelines, increasing your backup storage bills, and delaying recovery times during critical incidents. That is why index design decisions should never be taken in isolation – their impact on the broader operational ecosystem must be evaluated with equal diligence.

Conclusion

INCLUDE is a powerful tool, but it isn’t a free lunch. Before adding wide INCLUDE columns to cover your reporting queries, ask yourself: is this coverage worth the hidden cost in transaction log generation, replication bandwidth, and RTO compliance?

Sometimes, the performance tuning choices we make echo far beyond our immediate query – and INCLUDE columns are one of those choices.
