Blog Post

Transforming SSIS' sysssislog Entries Into Something More Readable

,

The format of sysssislog in SSIS 2005 and 2008 isn't great for human consumption, so what follows is an attempt to reformat it for easier digestion.  It isn't perfect - and here's hoping that the new Denali (the next version of SQL Server) server-side toolset will improve the situation it as much as I think it will.
Raw sysssislog Table Listing
My difficulty is with the nature of the logging - it "bubbles up" the same message to each containing task. The result is that the same message is reported multiple times in the log, which doesn't make it very concise (for reading). There's no doubt that it does make it easier to see everything that happened to or within a specific task - but that's not typically how I read the logs.  Issuing a simple query such as:
SELECT id, event, starttime, source, message FROM sysssislog ORDER BY id
Results in a rowset like this (click to enlarge):

My difficulty with the above is that (in this case) there are two rows per "message" - and that's just a simple demo package.  More typically, I've got five or more nested containers, resulting in a lot of repetition.
The Distilled Query
My desire was to distill the log records such that there was only one row per "message" instead of several.  You can't simply group by the message, of course. You have to group only adjacent entries and also (preferably) still show the involvement of each level of the Task hierarchy, because that context information is still valuable.
What I've arrived at is the following, which will distill the last 1000 entries in the log into something less wordy.  It's still not ideal - tasks executing in parallel get mashed together.  Those of you who are experts in T-SQL and windowing functions will no doubt be able to optimize this.  Please post a comment if you do!
WITH
  
LOG_SUBSET AS
  
(
    
SELECT TOP 1000 id, REPLACE(event, 'User:', '') AS event, starttime, source, message
      FROM
sysssislog
      
ORDER BY id DESC
  
),
  
ORDERED_LOG_SUBSET AS
  
(
    
SELECT TOP 1000 id, event, starttime, source, message
      FROM
LOG_SUBSET
      
ORDER BY id
  
),
  
EVENTTYPED_LOG AS
  
(
    
SELECT id, event, starttime, source, message,
      
HASHBYTES('SHA1', CAST(starttime AS CHAR(20)) + event + message) AS eventtype_group_key
    
FROM ORDERED_LOG_SUBSET
  
),
  
CONTIGUOUS_EVENT_GROUPED_LOG AS
  
(
    
SELECT id, event, starttime, source, message, eventtype_group_key,
      
ROW_NUMBER() OVER (ORDER BY id) AS s1,
      
ROW_NUMBER() OVER (PARTITION BY eventtype_group_key ORDER BY id) AS s2,
      
ROW_NUMBER() OVER (ORDER BY id) - ROW_NUMBER() OVER (PARTITION BY eventtype_group_key ORDER BY id) AS sequence
    
FROM EVENTTYPED_LOG
  
),
  
GROUPED_LOG AS
  
(
    
SELECT id, event, starttime, source, message, eventtype_group_key, s1, s2, sequence,
      
HASHBYTES('SHA1', CAST(eventtype_group_key AS CHAR(36)) + CAST(sequence AS CHAR(10))) AS event_group
    
FROM CONTIGUOUS_EVENT_GROUPED_LOG
  
)SELECT MIN(id) AS id, event, starttime, message,
    
SUBSTRING(REPLACE(REPLACE(
      (
        
SELECT ' < ' + source
          
FROM GROUPED_LOG AS Y
          
WHERE X.event_group = Y.event_group
          
GROUP BY source
          
ORDER BY MIN(id) DESC
            FOR XML
PATH('')
      ),
      
'&lt;', '<'), '&amp;', '&'), 3, 2000) AS source_list
  
FROM GROUPED_LOG AS X
  
GROUP BY event, starttime, message, event_group
  
ORDER BY MIN(id)

A sample of the resulting rowset (click to enlarge):

It's not fantastic, but it does improve my diagnostic tracing a little bit.
Thanks to information from Boneist and Tom Cooper on grouping, and Anith Sen on pivoting.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating