Reboot Server to solve long running maintenance jobs.

  • I have a VM server on Windows 2012 and SQL 2012 SP3 with 4 instances and 294 gb of memory and 16 CPU's connect to a Nimble sand.  I have noticed with after our monthly maintenance when the server gets rebooted our database maintenace jobs (index optimization and checkdb) run very fast every day for the first week.  But after that they start to take longer and longer to the point where they run for 4 to 5 hours.  At this point we start having blocking and other performance issues.  If we reboot the server everything goes back to normal and then the cycle start over.  As an experiment I flushed the memory with several DBCC commands.   But this made no difference.  Only a reboot seems to solve the problem.  Am I dealing with a memory issue or an IO issue or some other issue.  The only thing I can do for SQL is apply CU7.  There is nothing I can do for the sand(which uses a default blocking of 8K pages).

  • Have you done any analysis to determine which one (index or checkdb) is causing the long maintenance window?

    You're trying to do index maintenance every day, what method are you using to do that?  You may want to consider doing index maintenance less frequently, and not be so worried about index fragmentation, something the Brent Ozar group has written a lot about:
    https://www.brentozar.com/?s=index+fragmentation

    When you do index maintenance, I hope you're using one of the free, smarter ways of managing it such as Ola Hallengren scripts:
    https://ola.hallengren.com/
    or Minionware scripts:
    http://minionware.net/#miniontabs|1
    instead of the way that Maintenance Plans work, which is an all or nothing approach.
    I've also found that the old recommended 5% to 30% reorg, > 30% rebuild is very aggressive, and may result in a lot of extra maintenance work without much observable benefit.  A side effect of an index rebuild, is that statistics is regenerated from the full scan performed on indexing, and ensuring that statistics are up to date is more important than how fragmented your indexes may be, unless your usage pattern is mainly large scans versus lots of seeks.

    Running CHECKDB can be done many different ways, what options are you using?  There's lots of advice available for optimizing how that runs:
    https://sqlperformance.com/2012/11/io-subsystem/minimize-impact-of-checkdb

  • Chris Harshman - Tuesday, January 31, 2017 11:22 AM

    Have you done any analysis to determine which one (index or checkdb) is causing the long maintenance window?

    You're trying to do index maintenance every day, what method are you using to do that?  You may want to consider doing index maintenance less frequently, and not be so worried about index fragmentation, something the Brent Ozar group has written a lot about:
    https://www.brentozar.com/?s=index+fragmentation

    When you do index maintenance, I hope you're using one of the free, smarter ways of managing it such as Ola Hallengren scripts:
    https://ola.hallengren.com/
    or Minionware scripts:
    http://minionware.net/#miniontabs|1
    instead of the way that Maintenance Plans work, which is an all or nothing approach.
    I've also found that the old recommended 5% to 30% reorg, > 30% rebuild is very aggressive, and may result in a lot of extra maintenance work without much observable benefit.  A side effect of an index rebuild, is that statistics is regenerated from the full scan performed on indexing, and ensuring that statistics are up to date is more important than how fragmented your indexes may be, unless your usage pattern is mainly large scans versus lots of seeks.

    Running CHECKDB can be done many different ways, what options are you using?  There's lots of advice available for optimizing how that runs:
    https://sqlperformance.com/2012/11/io-subsystem/minimize-impact-of-checkdb

    I wish it was a matter of a bad index or database.  I am using Ola's scripts.  These maintenance scripts are just the easiest way for us to see things are getting slow and slower.  Even if I turn these scripts off, which I have done when they started running for 8 hours,  the application are already having performance issues.  We are also getting the message that a sigificant part of SQL Server process memory has been paged out.  This has only started in the last 3 months with no problems the year before.

  • rwyoung01 - Tuesday, January 31, 2017 12:02 PM

    Chris Harshman - Tuesday, January 31, 2017 11:22 AM

    Have you done any analysis to determine which one (index or checkdb) is causing the long maintenance window?

    You're trying to do index maintenance every day, what method are you using to do that?  You may want to consider doing index maintenance less frequently, and not be so worried about index fragmentation, something the Brent Ozar group has written a lot about:
    https://www.brentozar.com/?s=index+fragmentation

    When you do index maintenance, I hope you're using one of the free, smarter ways of managing it such as Ola Hallengren scripts:
    https://ola.hallengren.com/
    or Minionware scripts:
    http://minionware.net/#miniontabs|1
    instead of the way that Maintenance Plans work, which is an all or nothing approach.
    I've also found that the old recommended 5% to 30% reorg, > 30% rebuild is very aggressive, and may result in a lot of extra maintenance work without much observable benefit.  A side effect of an index rebuild, is that statistics is regenerated from the full scan performed on indexing, and ensuring that statistics are up to date is more important than how fragmented your indexes may be, unless your usage pattern is mainly large scans versus lots of seeks.

    Running CHECKDB can be done many different ways, what options are you using?  There's lots of advice available for optimizing how that runs:
    https://sqlperformance.com/2012/11/io-subsystem/minimize-impact-of-checkdb

    I wish it was a matter of a bad index or database.  I am using Ola's scripts.  These maintenance scripts are just the easiest way for us to see things are getting slow and slower.  Even if I turn these scripts off, which I have done when they started running for 8 hours,  the application are already having performance issues.  We are also getting the message that a sigificant part of SQL Server process memory has been paged out.  This has only started in the last 3 months with no problems the year before.

    Has the database usage changed in the last few months?  More new records, more modifications, was a new batch job added etc?

  • you mentioned this was a virtual machine "VM server on Windows 2012 and SQL 2012 SP3 with 4 instances and 294 gb of memory and 16 CPU's"
    is there anything else running on the VM besides SQL Server or is it the 4 instances fighting with each other over the 294 GB?  You might also want to check if there is a balloon driver stealing memory from the VM.
    https://www.brentozar.com/archive/2012/11/how-to-set-sql-server-max-memory-for-vmware/

  • And another thought is that it may not be SQL Server at all. I wouldn't necessarily limit any monitoring to just SQL Server. I'd include the server itself as well - maybe something isn't running correctly with the system file cache. Capture baselines after the reboot, capture performance metrics again when things start slowing down, etc.
    And then again, it could be SQL Server. At this point, I'm not sure it's clear where the issue is. You could also just have some large crappy queries that are causing problems. But there just isn't enough info to say whether it's SQL, the jobs, Windows, etc.
    With the error about paging out the memory, if you get that when SQL Server starts up it's likely something you can ignore. At other times, there are several different things that can cause issues. Here is a good read on the subject and a script that will provide detailed information on OS and SQL memory usage (it's the personal site of a guy who works at MS in SQL Server): 
    A significant part of SQL Server process memory has been paged out

    Sue

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply