
How to identify the source of page splits in a database


Page splits can slow down your queries and increase the amount of I/O needed to retrieve your data.

So, how do you monitor and identify the source of page splits, so you can avoid them in your databases?

The short answer is that you can monitor page splits with extended events. There’s an event called page_split. The problem is that there are many kinds of page split, and you only need to catch the problematic ones, but the page_split event doesn’t distinguish between them.
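For example, a minimal session on that generic event looks something like this (the session name and output path here are just placeholders). It fires for every split, so on its own it won’t tell you which splits are actually hurting you:

-- Minimal sketch: session name and path are placeholders, adjust them for your server
CREATE EVENT SESSION [AllSplits] ON SERVER
ADD EVENT sqlserver.page_split
ADD TARGET package0.event_file
    (SET filename = N'C:\xel\allsplits.xel');
GO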

When one page becomes full and a new page is needed, this is reported as a page split. But this is a regular operation – it doesn’t necessarily have bad consequences for your queries.

The problem occurs with updates and non-sequential inserts, when a row needs to be inserted in the middle of an object’s pages and there’s no space.

SQL Server creates a new page, moves half of the rows from the full page to the new one, and then writes the new row. This causes page fragmentation, which is very bad for performance, and it too is reported as a page split.
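To picture the difference, imagine a clustered index built on a random key: almost every insert lands in the middle of a full page and forces one of these ‘bad’ splits. The table below is purely illustrative:

-- Illustrative only: a random GUID clustered key makes most inserts land in
-- the middle of full pages, forcing 'bad' splits and fragmentation
CREATE TABLE dbo.SplitDemo
(
    Id      uniqueidentifier NOT NULL DEFAULT NEWID() PRIMARY KEY CLUSTERED,
    Payload char(200)        NOT NULL DEFAULT 'x'
);
-- A sequential key (IDENTITY or NEWSEQUENTIALID) mostly adds pages at the end
-- of the index, which is still reported as a split but doesn't fragment it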

You can find these problematic page splits using the sqlserver.transaction_log event. This event monitors all the activity in the transaction log, so we need to use it with caution.

Specifically, you need to filter the operation field, looking for the value 11, which records LOP_DELETE_SPLIT. This is the deletion of rows that happens when SQL Server moves rows from one page to another in a ‘bad’ page split.

This event isn’t visible in the Extended Events interface, so we need to create the session using T-SQL:

CREATE EVENT SESSION [BadSplits] ON SERVER
ADD EVENT sqlserver.transaction_log
(
    WHERE operation = 11 -- LOP_DELETE_SPLIT
)
ADD TARGET package0.event_file
    -- You need to customize the path
    (SET filename = N'C:\xel\badsplits.xel')
GO

-- Start the session
ALTER EVENT SESSION [BadSplits]
ON SERVER STATE = START;
GO

After some time, we can query this session to find the source of a page split. The first challenge is to find the database where the page splits are happening.

To read the information in the files, you need to use the DMF sys.fn_xe_file_target_read_file. The event data is returned as XML, so we need to use XML methods to extract the values we want:

WITH qry AS
(   SELECT
        -- Retrieve the database_id from inside the XML document
        theNodes.event_data.value
            ('(data[@name="database_id"]/value)[1]', 'int') AS database_id
    FROM
        (SELECT CONVERT(XML, event_data) event_data -- convert the text field to XML
         FROM   -- read the information in the event files
            sys.fn_xe_file_target_read_file
                ('c:\xel\badsplits*.xel', NULL, NULL, NULL)) theData
        CROSS APPLY theData.event_data.nodes('//event') theNodes (event_data)
)
SELECT  DB_NAME(database_id), COUNT(*) AS total
FROM    qry
GROUP BY DB_NAME(database_id)   -- group the result by database
ORDER BY total DESC

Once you know the database with the most page splits, you need to identify the objects that are causing the problem. This is a bit tricky, but here’s how it works (the full query is below).

The event only returns the alloc_unit_id, so you need to map each allocation unit to the object that owns those pages. This mapping query has to run in the database that contains the allocation units, so we’ll also filter the results to that specific database.

We now need to join the result with the DMV sys.allocation_units on the allocation_unit_id field. This DMV has three kinds of allocation units, but we are interested in only two of them: type 1 (IN_ROW_DATA) and type 3 (ROW_OVERFLOW_DATA), which represent data row pages and row-overflow pages.

Next, we need to add sys.partitions to the query, and join the container_id field of sys.allocation_units with the hobt_id field in sys.partitions. Finally, we add sys.objects to the query, joining with the object_id field from sys.partitions, to get the name from sys.objects:

WITH qry AS
(   SELECT
        theNodes.event_data.value
            ('(data[@name="database_id"]/value)[1]', 'int') AS database_id,
        theNodes.event_data.value
            ('(data[@name="alloc_unit_id"]/value)[1]', 'varchar(30)') AS alloc_unit_id,
        theNodes.event_data.value
            ('(data[@name="context"]/text)[1]', 'varchar(30)') AS context
    FROM
        (SELECT CONVERT(XML, event_data) event_data
         FROM   sys.fn_xe_file_target_read_file
                ('c:\xel\badsplits*.xel', NULL, NULL, NULL)) theData
        CROSS APPLY theData.event_data.nodes('//event') theNodes (event_data)
)
SELECT  name, context, COUNT(*) AS total    -- the count of splits by object
FROM    qry, sys.allocation_units au, sys.partitions p, sys.objects ob
WHERE   qry.alloc_unit_id = au.allocation_unit_id
  AND   au.container_id = p.hobt_id
  AND   p.object_id = ob.object_id
  AND   (au.type = 1 OR au.type = 3)
  AND   DB_NAME(database_id) = 'MDW'        -- filter by the database
GROUP BY name, context                      -- group by object name and context
ORDER BY name

Now that you know which table is causing the most page splits, you can analyze its indexes to solve the problem.
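As a starting point, you could check the fragmentation of the table’s indexes and, if non-sequential inserts are unavoidable, rebuild them with a lower fill factor so each page keeps some free space. The database and table names below are placeholders:

-- Placeholders: replace 'MDW' and dbo.MyTable with your own database and table
SELECT  i.name, ps.index_type_desc,
        ps.avg_fragmentation_in_percent, ps.page_count
FROM    sys.dm_db_index_physical_stats
            (DB_ID('MDW'), OBJECT_ID('dbo.MyTable'), NULL, NULL, 'LIMITED') ps
JOIN    sys.indexes i
  ON    i.object_id = ps.object_id AND i.index_id = ps.index_id;

-- Leave some free space on each page so future inserts don't force splits
ALTER INDEX ALL ON dbo.MyTable REBUILD WITH (FILLFACTOR = 90);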

Although this solution can identify page split problems, you can’t let the session run for a long time: it captures transaction log activity, which can be too intensive for the server.
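When you’ve captured enough data, stop the session, and drop it if you no longer need it:

-- Stop and remove the BadSplits session created earlier
ALTER EVENT SESSION [BadSplits]
ON SERVER STATE = STOP;
GO

DROP EVENT SESSION [BadSplits] ON SERVER;
GO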

Thanks again to Red Gate, and do like the post if you find it useful.
