MSDB growth on sysjobstepslogs due to whitespace

  • Oh, and yes, the table is still around. It was not a temp table, as I used it as a backup first in case the records were needed. The above data collection that looks low in space used but higher in rows is from the backup table.

  • matt.newman (6/23/2011)


    Oh, and yes, the table is still around. It was not a temp table, as I used it as a backup first in case the records were needed. The above data collection that looks low in space used but higher in rows is from the backup table.

    Are you able to repair that one with the steps in the article? If you are then you have your workaround (not pretty, but should work).

  • I don't see anything to repair with the backup table. The data size for that table is as expected. No new data is going into that table. It is not from a restore point. I would need to restore a backup from msdb to another server to examine further and I can't have those sql agent jobs run on any system other than production even shutting down dbmail which it already is on another server there are too many variables where I don't know what is doing this and I need to finish tasks assigned in a sprint. It is coming down to time, and a battle axe solution.

    I only see an issue with space not being reclaimed from the actively used table, presumably due to something being caught in a rule that is a broken rule.

  • I never thought I would be trying to perform and exorcism of ghosts as a DBA.

    I have found a few more resources. Since the issue is recurring and I would like a better understanding of the behind the scenes I am going to take the links for homework. Restart of the SQL Agent might clear something out, but if I keep seeing ghosts even after I truncate the table... where everything is committed, I would like to put a stop to the error in a better way than battle axe to snapshot isolation. I will post findings after I schedule some maintenance time or such. Most likely will be on hold for a little bit until I finish some other tasks. Hopfully I will have a script to locate errors and correct the issue but don't know at this stage.

    http://sqlserverpedia.com/wiki/Trace_Flags#DBCC_TRACEON.283604.29_and_DBCC_TRACEON.283605.29

    http://www.sqlskills.com/blogs/paul/post/inside-the-storage-engine-ghost-cleanup-in-depth.aspx

    http://blogs.msdn.com/b/sqlserverstorageengine/archive/2006/12/13/more-undocumented-fun_3a00_-dbcc-ind_2c00_-dbcc-page_2c00_-and-off_2d00_row-columns.aspx

    http://www.sqlservercentral.com/blogs/sqlinthewild/archive/2011/6/7/on-the-exorcism-of-ghost-records.aspx

  • I am curious if anyone has used dbcc forceghostcleanup on production and it caused any issues. Not looking to put someone at a liable state, just looking to see if it caused any trouble. I have two tables that I have tried to force ghost cleanup using a few means.

    use msdb;

    go

    sp_spaceused; -- report for space used before

    go

    sp_spaceused sysjobstepslogs; -- report for space used before

    go

    select * into [LOGS].[dbo].[TMPsysJobStepsLogs071111] from msdb.dbo.sysjobstepslogs; -- move "active" (non-ghost) records to temp table

    go

    truncate table msdb.dbo.sysjobstepslogs; -- clean up all contents of table, done to prevent large log buildup from final step (listed below)

    go

    alter index PK__sysjobstepslogs__1B0907CE on sysjobstepslogs rebuild; -- clean up index

    go

    sp_spaceused; -- report for after

    go

    sp_spaceused sysjobstepslogs; -- report for after

    select * from sysjobstepslogs with (index=PK__sysjobstepslogs__1B0907CE) -- tried to force scan on table, no affect

    select ghost_record_count, version_ghost_record_count, * from sys.dm_db_index_physical_stats (db_id('msdb'), object_id('sysjobstepslogs'), null,null,'DETAILED') -- record_count, ghost_record_count, version_ghost_record_count: 23331, 125315, 0

    dbcc tracestatus(661) -- flag is off, ghost cleanup SHOULD be working.

    TraceFlagStatusGlobalSession

    661000

    Each day, I get roughly 1 GB unreleased space from that table where after 10 days I need to truncate to alleviate. I cannot restart the server or service. There has been no stopping these ghosts from creeping up. SQL Server 2005 SP4. All patches referenced for this I could find, already applied. MSDB has only 1 file. Why the process is refusing to remove these ghosts, I can't say. It just started happening one day, a couple days after I got hired in. The only thing I have found online that says anything like this is resolved by is a service restart... to which I cannot do at this time.

  • Not me.

    I'd try contacting the MVP you refferenced earlier. I think this is your best best at this point. It's unlikely than more than 1-2 people, if anyone on this site has experience with this.

  • matt.newman (6/23/2011)


    I never thought I would be trying to perform and exorcism of ghosts as a DBA.

    I have found a few more resources. Since the issue is recurring and I would like a better understanding of the behind the scenes I am going to take the links for homework. Restart of the SQL Agent might clear something out, but if I keep seeing ghosts even after I truncate the table... where everything is committed, I would like to put a stop to the error in a better way than battle axe to snapshot isolation. I will post findings after I schedule some maintenance time or such. Most likely will be on hold for a little bit until I finish some other tasks. Hopfully I will have a script to locate errors and correct the issue but don't know at this stage.

    http://sqlserverpedia.com/wiki/Trace_Flags#DBCC_TRACEON.283604.29_and_DBCC_TRACEON.283605.29

    http://www.sqlskills.com/blogs/paul/post/inside-the-storage-engine-ghost-cleanup-in-depth.aspx

    http://blogs.msdn.com/b/sqlserverstorageengine/archive/2006/12/13/more-undocumented-fun_3a00_-dbcc-ind_2c00_-dbcc-page_2c00_-and-off_2d00_row-columns.aspx

    http://www.sqlservercentral.com/blogs/sqlinthewild/archive/2011/6/7/on-the-exorcism-of-ghost-records.aspx

    Hi Matt, just wondering if you ever resolved this ??

    I have a similar problem on my production box. sysjobstepslogs is growing about 100 meg per day, currently 3 gig, even though the # of records stays constant about only 66.

  • Sort of... ish... yes. I have not seen the issue since my last server restart but I have not confirmed it was a result of an attempted "fix" done.

    You can try out a DBCC CleanTable if you are comfortable with that, to try and recover from the LOB_Data allocation unit, in case the ghost record is stuck in there and not releasing the "lock". I would recommend against the truncate to the table, even if that is an acceptable solution - depending on many things but I would not recommend using that unless you have a very small system and you know it inside and out of what that tables holds and what you need it for. The truncate was my emergency solution until I could restart the server.

    The overall solution I had was restart the server. I have not run into this since last time, and prior to me working here it was seen many more times. The issue typically popped up when there was some kind of IO write problem, either on the switch or the OS level.

    DBCC Cleantable can be intensive. Study up and test out any use on a good test box before going forward.

    http://technet.microsoft.com/en-us/library/ms189792(v=sql.90).aspx

    http://technet.microsoft.com/en-us/library/ms174418(v=sql.90).aspx

    Sorry if the wording in this is off, it's been a while since I have dealt with this. I have not had IO write problems since last time I had this issue come up (at least on this box directly.

  • Also, you may have something set to "Append" to existing record, whcih would also make the growth occur. I'd have to know if you checked for space used internally for each record to see if it's a valid record or a ghosted one/ series of ghosted ones.

    USE [msdb]

    GO

    select name, step_name, flags, * from sysjobsteps js

    join sysjobs sj on sj.job_id = js.job_id

    where flags <> 0 /* you are looking for 16 or 20 */

    In the sysjobstepslogs table there is a log size that you can look at to make sure it is small, and you can use the sum datalength for the table as well - as previously mentioned. If it is ghost records, you'd have to verify using the index stats, but that select takes a toll on the servers.

  • I lloked up cleantable and it says it is for reclaiming space after dropping a 'max' column, so I don't think that would help me. No jobs set to "append". I hopeI have it resolved. Here's what I did:

    -- Identify "big" record based on log_size column

    select log_size, log.step_uid , job.name, step.*

    from sysjobstepslogs log

    join sysjobsteps step on step.step_uid = log.step_uid

    join sysjobs job on job.job_id = step.job_id

    -- disabled “Log to Table” for that job

    delete from sysjobstepslogs -- delete "big" record

    where step_uid = '2285168D-C19A-417A-8E5B-FC270D3C3274'

    -- Save Data

    select * into sysjobstepslogs_TEMP

    from sysjobstepslogs

    -- Truncate Live

    truncate table sysjobstepslogs

    -- Insert saved data back to live. Table is now small

    set IDENTITY_INSERT dbo.sysjobstepslogs ON

    INSERT INTO msdb.dbo.sysjobstepslogs

    (log_id, log ,date_created ,date_modified

    ,log_size ,step_uid)

    select log_id, log ,date_created ,date_modified

    ,log_size ,step_uid

    from sysjobstepslogs_TEMP

    --Shrink MSDB database to get back space.

  • Truncate will clear out the records BUT if you have a recurring ghost problem like I did, it is not an end all solution.

    If you are getting the "big record" size from the table itself then it is not the same problem as I was having with unseen records from the table itself. You might very well have a lot of white space saving into the log record, then. A varchar(max) in a select that saves in the log output is the main cause for that. Wherever your code is for the logging for the job step in question, you need to make sure you have no select from anything with a varchar(max) output in the result. If there is a varchar(max) record, wrapping a substring function of some measure around it will help prevent large growths on that table.

  • Also, you set identity insert on, but then not off.

Viewing 12 posts - 16 through 26 (of 26 total)

You must be logged in to reply to this topic. Login to reply