SQLServerCentral Article

Reading SQL Server's Transaction Log

,

Introduction

So you want to read the transaction log? The reasons for doing so are many and varied, so let me start by explaining my reasons. However, before I do let me first recognize some of the people who have helped me get to this point and apologize to any I may have missed. Thank you - Paul S. Randal, Kimberly Tripp, Nigel Rivett, Kalen Delaney, Rick Negrin, Isaac Kunen, Danny Gould and Kevvie Fowler.

Many years ago I read this article about MGM Mirage's near real-time data warehouse using SQL Server 2000 (Nov. 2000). After reading it, I was determined to do the same thing that MGM and SRD had achieved. Considering that if they had already done it, there must be a way, I just needed to find it. So I researched the process SRD was using to "capture" changes, which lead me to reading the Transaction Log. So I spent the years following reading articles, blogs and fragments by the fore mentioned individuals gleaning how to read the "active" portion of the log and "crack" the record contents.

"cracking records manually is fun (if you're twisted as well as curious like I am :-)" Paul S. Randall http://blogs.msdn.com/sqlserverstorageengine/archive/2006/08/09/692806.aspx)

Thus, the development of the process for decoding the log had begun. But enough talk, let's crack.

Getting Started

Use the script below to get started:

CREATE DATABASE [Crack_Me];
GO
USE Crack_Me;
GO
CREATE TABLE [dbo].[Crack_Me_If_You_Can](
 [ID] [int] PRIMARY KEY IDENTITY NOT NULL,
 [Insert_Date] [datetime] NOT NULL,
 [Some_Data] [varchar](100) NOT NULL,
 [Optional_Data] [varchar](50)NULL,
 [Life_the_Universe_and_Everything] [int] NOT NULL,
);
GO
INSERT INTO [Crack_Me_If_You_Can]
(
 Insert_Date,
 Some_Data,
 Optional_Data, 
 Life_the_Universe_and_Everything
)
VALUES (GetDate(), 'Don''t Panic', 'Share and Enjoy', 42);
GO
SELECT * FROM Crack_Me_If_You_Can;
GO

Now that we've got some data to look at, we'll use one of the most blogged about formally undocumented functions in the world of Microsoft to view the records in the transaction log.

SELECT * FROM fn_dblog(NULL, NULL)

However, one glance at what is returned and you'll quickly find that there is a lot more going on in the log than the Insert you've just completed. To narrow the results to only what we're interested in we'll need to find the AllocUnitId. Think of the AllocUnitId as an identity for the instance of the "Crack_Me_If_You_Can" table. I say instance, because the AllocUnitId will change with significant schema changes.

The following query will list all the objects created in the database with an object type of 'user'.

USE Crack_Me;
GO
SELECT allocunits.allocation_unit_id, objects.name, objects.id
FROM sys.allocation_units allocunits
INNER JOIN sys.partitions partitions ON (allocunits.type IN (1, 3) 
 AND partitions.hobt_id = allocunits.container_id)
 OR (allocunits.type = 2 and partitions.partition_id = allocunits.container_id)
INNER JOIN sysobjects objects ON partitions.object_id = objects.id
 AND objects.type IN ('U', 'u')
WHERE partitions.index_id IN (0, 1)

My particular instance returned allocation_unit_id (72057594039697408) for the table "Crack_Me_If_You_Can", your instance will return something similar.

SELECT * FROM fn_dblog(NULL, NULL)
WHERE AllocUnitId = 72057594039697408
AND Operation = 'LOP_INSERT_ROWS'

Now we're getting somewhere. Narrowing the search to just our table yields about five rows to look through; further narrowing the search to "LOP_INSERT_ROWS'" yields the one record we want to look at. I've taken the liberty of further eliminating the signal from the noise with the next query statement. The following statement gives all the fields relevant to us and unifies the previous query statements.

DBCC TRACEON(2537)
SELECT
 [Current LSN], 
 Operation,
 dblog.[Transaction ID],
 AllocUnitId,
 AllocUnitName,
 [Page ID],
 [Slot ID],
 [Num Elements],
 dblog1.[Begin Time],
 dblog1.[Transaction Name],
 [RowLog Contents 0],
 [Log Record]
FROM ::fn_dblog(NULL, NULL) dblog
 INNER JOIN 
 ( 
 SELECT allocunits.allocation_unit_id, objects.name, objects.id
 FROM sys.allocation_units allocunits 
 INNER JOIN sys.partitions partitions ON (allocunits.type IN (1, 3) 
 AND partitions.hobt_id = allocunits.container_id) 
 OR (allocunits.type = 2 and partitions.partition_id = allocunits.container_id) 
 INNER JOIN sysobjects objects ON partitions.object_id = objects.id
 AND objects.type IN ('U', 'u')
 WHERE partitions.index_id IN (0, 1)
 ) allocunits ON dblog.AllocUnitID = allocunits.allocation_unit_id 
 INNER JOIN 
 (
 SELECT [Begin Time],[Transaction Name],[Transaction ID]
 FROM fn_dblog(NULL, NULL) x
 WHERE Operation = 'LOP_BEGIN_XACT' 
 ) dblog1 ON dblog1.[Transaction ID] = dblog.[Transaction ID]
WHERE [Page ID] IS NOT NULL AND [Slot ID] >= 0
 AND dblog.[Transaction ID] != '0000:00000000' 
 AND Context in ('LCX_HEAP', 'LCX_CLUSTERED')
DBCC TRACEOFF(2537)

Exploring the Page

On a side note take a look at the values in [Page ID] and [Slot ID], mine are "0001:00000091" and "0" respectively. Insert your [Page ID] value in place of my @pageID$ and run the following query script.

DECLARE @pageID$ NVARCHAR(23), @pageID NVARCHAR(50), @sqlCmd NVARCHAR(4000);
SET @pageID$ = '0001:00000091'
SELECT @pageID = 
CONVERT(VARCHAR(4), CONVERT(INT, CONVERT(VARBINARY,
SUBSTRING(@pageID$, 0, 5), 2)))
+ ',' +
CONVERT(VARCHAR(8), CONVERT(INT, CONVERT(VARBINARY,
SUBSTRING(@pageID$, 6, 8), 2)))
SET @sqlCmd = 'DBCC PAGE (''Crack_Me'',' + @pageID + ',3) WITH TABLERESULTS'
EXECUTE(@sqlCmd)

Scroll down until you get to your [Slot ID] value. In the [Field] and [Value] columns you'll find the column names and values that you previously inserted. At this point you might be asking "is all that necessary to view the data we inserted? Or why not just use this method all the time, why explore any further?"

Two reasons: First, on a clustered table the data doesn't always remain in the same [PAGE ID] and [Slot ID], it's constantly being reshuffled and reorganized to optimize space on the Page. In a database such as ours where no other activity is going on this reshuffling isn't noticeable, but in an active database there's no telling when and how often it will occur. However, if our table had been a HEAP this method works nicely because the data remains with the [Page ID] and [Slot ID] for the life of the HEAP. The second reason, it that calling 'DBCC PAGE' for every record we need to lookup can be a constant drain on the SQL Server, and should be used sparingly.

Record Structure

Getting back to the main topic; grab the value from [RowLog Contents 0] from our fn_dblog query. It should be...

0x3000140001000000E10CF400EA9D00002A000000050000020028003700446F6E2774205061

6E6963536861726520616E6420456E6A6F79

To understand how the record is structured have a look at the following table. Similar structure listings can be found in postings from Paul S. Randal and Nigel Rivett, or SQL Server Database Forensics by Kevvie Fowler.

The colorized items are standard in every record that is an INSERT and comprised of FIXED and VARIABLE length columns. If only fixed data existed, the variable values would not be present and the first byte value 0x30 would also differ. Armed with this information, I've created an over-simplified query statement that breaks out and displays these items from our [RowLog Contents 0 ] value.

DECLARE @RowLogContents VARBINARY(8000)
SET @RowLogContents = 0x3000140001000000E10CF400EA9D00002A000000050000020028003700446F6E27742050616E6963536861726520616E6420456E6A6F79
DECLARE @lenFixedBytes SMALLINT, @noOfCols SMALLINT, @nullBitMapLength SMALLINT, @nullByteMap VARBINARY(MAX), @nullBitMap VARCHAR(MAX), @noVarCols SMALLINT, @columnOffsetArray VARBINARY(MAX), @varColPointer SMALLINT
SELECT 
 @lenFixedBytes = CONVERT(SMALLINT, CONVERT(BINARY(2), REVERSE(SUBSTRING(@RowLogContents, 2 + 1, 2)))),
 @noOfCols = CONVERT(INT, CONVERT(BINARY(2), REVERSE(SUBSTRING(@RowLogContents, @lenFixedBytes + 1, 2)))),
 @nullBitMapLength = CONVERT(INT, ceiling(@noOfCols/8.0)),
 @nullByteMap = SUBSTRING(@RowLogContents, @lenFixedBytes + 3, @nullBitMapLength),
 @noVarCols = CASE WHEN SUBSTRING(@RowLogContents, 1, 1) = 0x30 THEN
 CONVERT(INT, CONVERT(BINARY(2), REVERSE(SUBSTRING(@RowLogContents, @lenFixedBytes + 3 + @nullBitMapLength, 2))))
 ELSE null
 END, 
 @columnOffsetArray = CASE WHEN SUBSTRING(@RowLogContents, 1, 1) = 0x30 THEN
 SUBSTRING(@RowLogContents, @lenFixedBytes + 3 + @nullBitMapLength + 2, @noVarCols * 2)
 ELSE null
 END,
 @varColPointer = CASE WHEN SUBSTRING(@RowLogContents, 1, 1) = 0x30 THEN 
 (@lenFixedBytes + 2 + @nullBitMapLength + 2 + (@noVarCols * 2))
 ELSE null
 END
DECLARE @byteTable TABLE
(
 byte INT
)
DECLARE @cnt INT 
SET @cnt = 1
WHILE (@cnt < @nullBitMapLength + 1)
BEGIN
 INSERT INTO @byteTable(byte)
 VALUES(@cnt)
 SET @cnt = @cnt +1
END
SELECT 
 @nullBitMap = COALESCE(@nullBitMap, '') + 
 CONVERT(NVARCHAR(1), (SUBSTRING(@nullByteMap, byte, 1) / 128) % 2) + 
 CONVERT(NVARCHAR(1), (SUBSTRING(@nullByteMap, byte, 1) / 64) % 2) +
 CONVERT(NVARCHAR(1), (SUBSTRING(@nullByteMap, byte, 1) / 32) % 2) +
 CONVERT(NVARCHAR(1), (SUBSTRING(@nullByteMap, byte, 1) / 16) % 2) +
 CONVERT(NVARCHAR(1), (SUBSTRING(@nullByteMap, byte, 1) / 8) % 2) +
 CONVERT(NVARCHAR(1), (SUBSTRING(@nullByteMap, byte, 1) / 4) % 2) +
 CONVERT(NVARCHAR(1), (SUBSTRING(@nullByteMap, byte, 1) / 2) % 2) +
 CONVERT(NVARCHAR(1), SUBSTRING(@nullByteMap, byte, 1) % 2) 
FROM @byteTable b 
ORDER BY byte DESC
SELECT 
 SUBSTRING(@RowLogContents, 2 + 1, 2) AS lenFixedBytes,
 SUBSTRING(@RowLogContents, @lenFixedBytes + 1, 2) AS noOfCols,
 SUBSTRING(@RowLogContents, @lenFixedBytes + 3, @nullBitMapLength) AS nullByteMap,
 SUBSTRING(@RowLogContents, @lenFixedBytes + 3 + @nullBitMapLength, 2) AS noVarCols,
 SUBSTRING(@RowLogContents, @lenFixedBytes + 3 + @nullBitMapLength + 2, @noVarCols * 2) AS columnOffsetArray,
 @lenFixedBytes + 2 + @nullBitMapLength + 2 + (@noVarCols * 2) AS varColStart
SELECT 
 @lenFixedBytes AS lenFixedBytes, 
 @noOfCols AS noOfCols, 
 @nullBitMapLength AS nullBitMapLength,
 @nullByteMap AS nullByteMap,
 @nullBitMap AS nullBitMap,
 @noVarCols AS noVarCols,
 @columnOffsetArray AS columnOffsetArray,
 @varColPointer AS varColStart
DECLARE @colOffsetTable TABLE
(
 colNum SMALLINT,
 columnOffset VARBINARY(2),
 columnOffvalue SMALLINT,
 columnLength SMALLINT
)
SET @cnt = 1
WHILE (@cnt <= @noVarCols)
BEGIN
 INSERT INTO @colOffsetTable(colNum, columnOffset, columnOffValue, columnLength)
 VALUES(
 @cnt * - 1, 
 SUBSTRING (@columnOffsetArray, (2 * @cnt) - 1, 2), 
 CONVERT(SMALLINT, CONVERT(BINARY(2), REVERSE (SUBSTRING (@columnOffsetArray, (2 * @cnt) - 1, 2)))), 
 CONVERT(SMALLINT, CONVERT(BINARY(2), REVERSE (SUBSTRING (@columnOffsetArray, (2 * @cnt) - 1, 2))))
 - ISNULL(NULLIF(CONVERT(SMALLINT, CONVERT(BINARY(2), REVERSE (SUBSTRING (@columnOffsetArray, (2 * (@cnt - 1)) - 1, 2)))), 0), @varColPointer)
 )
 SET @cnt = @cnt + 1
END
SELECT * FROM @colOffsetTable

You might notice that this statement converts the nullByteMap into bit values; had there been any NULLs inserted with our data there would have been a 1 in the respective column's position. Also the columnOffsetArray is broken into a table with respective starting byte and length for each variable value. I won't delve into the rest of the items here; the query allows you to see what each item translates into and it's respective hexadecimal value. That combined with the preceding table should provide enough insight into each item's location and purpose, what is still needed is a way to determine how the data values themselves are laid out and which segments are what.

The following query statement will give us exactly that!

SELECT
 cols.leaf_null_bit AS nullbit,
 ISNULL(syscolumns.length, cols.max_length) AS [length],
 CASE 
 WHEN is_uniqueifier = 1 THEN 'UNIQUIFIER'
 ELSE ISNULL(syscolumns.name, 'DROPPED')
 END [name],
 cols.system_type_id,
 cols.leaf_bit_position AS bitpos,
 ISNULL(syscolumns.xprec, cols.precision) AS xprec,
 ISNULL(syscolumns.xscale, cols.scale) AS xscale,
 cols.leaf_offset,
 is_uniqueifier
FROM sys.allocation_units allocunits
INNER JOIN sys.partitions partitions ON (allocunits.type IN (1, 3)
 AND partitions.hobt_id = allocunits.container_id) 
 OR(allocunits.type = 2 AND partitions.partition_id =allocunits.container_id)
INNER JOIN sys.system_internals_partition_columns cols ON
 cols.partition_id = partitions.partition_id
LEFT OUTER JOIN syscolumns ON syscolumns.id = partitions.object_id
 AND syscolumns.colid = cols.partition_column_id
WHERE allocunits.allocation_unit_id = 72057594039697408
ORDER BY nullbit

This query gives us the order of all the columns (leaf_null_bit) with regard to the transaction log, it also provides the lengths for our fixed length data items (leaf_offset > 0), you'll recall we already obtained the lengths of the variable length data items from the columnOffsetArray.

Inserting this query statement below our previous query statements coupled with adding placeholders for the first bytes of meta-data along and making use of the @nullBitMap, gives us a unified way to segment all the fixed and variable data items.

DECLARE @schema TABLE 
(
 [column] INT,
 [length] INT,
 [name] NVARCHAR(255),
 [system_type_id] INT,
 [bitpos] INT,
 [xprec] INT,
 [xscale] INT,
 [leaf_offset] INT,
 [is_uniqueifier] BIT,
 [is_null] BIT NULL
)
INSERT INTO @schema
SELECT
cols.leaf_null_bit AS nullbit,
ISNULL(syscolumns.length, cols.max_length) AS [length], 
CASE 
WHEN is_uniqueifier = 1 THEN 'UNIQUIFIER'
ELSE isnull(syscolumns.name, 'DROPPED')
END [name], 
cols.system_type_id,
cols.leaf_bit_position AS bitpos, 
ISNULL(syscolumns.xprec, cols.precision) AS xprec, 
ISNULL(syscolumns.xscale, cols.scale) AS xscale,
cols.leaf_offset,
is_uniqueifier,
SUBSTRING(REVERSE(@nullBitMap), cols.leaf_null_bit, 1) AS is_null
FROM sys.allocation_units allocunits 
INNER JOIN sys.partitions partitions ON (allocunits.type IN (1, 3) 
 AND partitions.hobt_id = allocunits.container_id) OR (allocunits.type = 2 AND partitions.partition_id = allocunits.container_id)
INNER JOIN sys.system_internals_partition_columns cols ON cols.partition_id = partitions.partition_id
LEFT OUTER JOIN syscolumns ON syscolumns.id = partitions.object_id 
 AND syscolumns.colid = cols.partition_column_id
WHERE allocunits.allocation_unit_id = 72057594039697408
ORDER BY nullbit
INSERT INTO @schema
SELECT -3, 1, 'StatusBitsA', 0, 0, 0, 0, 2147483647, 0, 0
INSERT INTO @schema
SELECT -2, 1, 'StatusBitsB', 0, 0, 0, 0, 2147483647, 0, 0
INSERT INTO @schema
SELECT -1, 2, 'LenFixedBytes', 52, 0, 10, 0, 2147483647, 0, 0
SELECT 
s.*,
CASE WHEN s.leaf_offset > 1 AND s.bitpos = 0 THEN 
SUBSTRING
(
@RowLogContents, 
ISNULL((SELECT TOP 1 SUM(x.length) FROM @schema x WHERE x.[column] < s.[column] AND x.leaf_offset > 1 AND x.bitpos = 0), 0) + 1,
s.length
)
ELSE
SUBSTRING
(
@RowLogContents, 
(col.columnOffValue - col.columnLength) + 1,
col.columnLength
)
END AS hex_string
FROM @schema s
LEFT OUTER JOIN @colOffsetTable col ON col.colNum = (s.leaf_offset)

Your results should look like the table below. The hex_string is now broken into individual segments with its corresponding column. The system_type_id will allow us in the next step to define rules for the conversion of each column type into clear text. In a later series I'll discuss using the xprec and xscale in the conversion of numeric and decimal values, but for now these columns will not be used. Is_uniqueifier is populated with 1 when a clustered index is created on a table where no primary key is defined. We defined a primary key which created our clustered index, so all the values are 0. Once again if we had inserted a NULL value into our data, the value in is_null would be 1 for that column.

Finally

The final piece of the puzzle, all that remains is to wrap our final SELECT statement inside the logic to convert the hex_string into its corresponding data type. I've also added a few additional data types, just for example sake. So feel free to alter and add fields to the Crack_Me_If_You_Can table and see the results that come out of it.

 SELECT
 [name] AS ColumnName, 
 CASE WHEN s.is_null = 1 THEN NULL ELSE 
 CASE 
 WHEN s.system_type_id IN (167, 175, 231, 239) THEN LTRIM(RTRIM(CONVERT(NVARCHAR(MAX), REVERSE(REVERSE(REPLACE(hex_string, 0x00, 0x20))))))
 WHEN s.system_type_id = 48 THEN CONVERT(NVARCHAR(MAX), CONVERT(TINYINT, CONVERT(BINARY(1), REVERSE (hex_string))))
 WHEN s.system_type_id = 52 THEN CONVERT(NVARCHAR(MAX), CONVERT(SMALLINT, CONVERT(BINARY(2), REVERSE (hex_string))))
 WHEN s.system_type_id = 56 THEN CONVERT(NVARCHAR(MAX), CONVERT(INT, CONVERT(BINARY(4), REVERSE(hex_string))))
 WHEN s.system_type_id = 127 THEN CONVERT(NVARCHAR(MAX), CONVERT(BIGINT, CONVERT(BINARY(8), REVERSE(hex_string))))
 WHEN s.system_type_id = 61 THEN 
 CONVERT(VARCHAR(MAX), 
 CONVERT(DATETIME, SUBSTRING(hex_string, 4, 1) + SUBSTRING(hex_string, 3, 1) + SUBSTRING(hex_string, 2, 1) + SUBSTRING(hex_string, 1, 1))
 + CONVERT(DATETIME, DATEADD(dd, CONVERT(INT, SUBSTRING(hex_string, 8, 1) + SUBSTRING(hex_string, 7, 1) + SUBSTRING(hex_string, 6, 1) + SUBSTRING(hex_string, 5, 1)), 0x00000000))
 , 109)
 WHEN s.system_type_id = 108 AND s.xprec = 5 AND s.xscale = 2 THEN CONVERT(VARCHAR(MAX), CONVERT(NUMERIC(5,2), 0x050200 + hex_string))
 WHEN s.system_type_id = 108 AND s.xprec = 6 AND s.xscale = 2 THEN CONVERT(VARCHAR(MAX), CONVERT(NUMERIC(6,2), 0x060200 + hex_string))
 WHEN s.system_type_id = 108 AND s.xprec = 6 AND s.xscale = 3 THEN CONVERT(VARCHAR(MAX), CONVERT(NUMERIC(6,3), 0x060300 + hex_string))
 WHEN s.system_type_id = 108 AND s.xprec = 7 AND s.xscale = 2 THEN CONVERT(VARCHAR(MAX), CONVERT(NUMERIC(7,2), 0x070200 + hex_string))
 WHEN s.system_type_id = 108 AND s.xprec = 8 AND s.xscale = 2 THEN CONVERT(VARCHAR(MAX), CONVERT(NUMERIC(8,2), 0x080200 + hex_string))
 WHEN s.system_type_id = 108 AND s.xprec = 9 AND s.xscale = 2 THEN CONVERT(VARCHAR(MAX), CONVERT(NUMERIC(9,2), 0x090200 + hex_string))
 WHEN s.system_type_id = 108 AND s.xprec = 10 AND s.xscale = 2 THEN CONVERT(VARCHAR(MAX), CONVERT(NUMERIC(10,2), 0x0A0200 + hex_string))
 END
 END AS ClearText
FROM (
 SELECT 
 s.*,
 CASE WHEN s.leaf_offset > 1 AND s.bitpos = 0 THEN 
 SUBSTRING
 (
 @RowLogContents, 
 ISNULL((SELECT TOP 1 SUM(x.length) FROM @schema x WHERE x.[column] < s.[column] AND x.leaf_offset > 1 AND x.bitpos = 0), 0) + 1,
 s.LENGTH
 )
 ELSE
 SUBSTRING
 (
 @RowLogContents, 
 (col.columnOffValue - col.columnLength) + 1,
 col.columnLength
 )
 END AS hex_string
 FROM @schema s
 LEFT OUTER JOIN @colOffsetTable col ON col.colNum = (s.leaf_offset)
) AS s
WHERE [column] > 0 AND is_uniqueifier = 0

Later in this series I'll show how to write a function that will allow us to join directly to fn_dblog and return clear text for each row with "LOP_INSERT_ROWS" operations. I'll cover how to handle "Update" operations in a future series, for now you'll be limited to "Inserts". However, don't let that curtail your fun in examining the transaction log; there are still plenty of exploring left and interesting things to be found.

Rate

4.8 (137)

You rated this post out of 5. Change rating

Share

Share

Rate

4.8 (137)

You rated this post out of 5. Change rating