Sanity Check - Ola Hallengren Job sheduling

  • Hi,

    I am setting up Ola's SQL Server Maintenance solution and I am not entirely sure about the setting up of my jobs scheduling.

    I ran the script to set up all the SPs, tables and jobs. This worked without any issues.

    Then I went onto the scheduling of those jobs and I am struggling with the timing of things. Hopefully someone can advise.

    Note that the DB is in use during the normal working hours of a normal day (i.e. 09:00 to 17:00) - and nobody should use the DB after 22:00.

    At the moment I have the following:

    DatabaseBackup - SYSTEM_DATABASES - FULL: run every work day at 22:30

    DatabaseBackup - USER_DATABASES - FULL: run every work day at 23:30

    (it is not possible that the first job (sys dbs) would take one hour to run, but I can just as well leave one hour between the two)

    DatabaseBackup - USER_DATABASES - LOG: well .. this is where I was starting to wonder if what I'm doing makes any sense at all ...

    I was going to schedule that one for circa 00:00:00 (i.e. 30 mns after the user full backup) - wise or not wise?

    And then I get to the DB integrity checks and the index optimize procs - I'm thinking that it's better to take the DB backups before checking integrity and optimizing indexes (since there is a tiny chance that I may break the DB files when runnning those procs) - is that correct?

    Essentially, what I'm asking is about the order of those jobs - what did you do? and possibly why did you do it like that?

    Cheers,

    B

  • Hi,

    I ended up setting up the following schedule:

    Index optimize to take place every Sunday at 20:00:00

    DatabaseIntegrityCheck - SYSTEM_DATABASES: every weekday and Sunday at 22:00:00

    DatabaseIntegrityCheck - USER_DATABASES: every weekday excluding Sunday at 23:00:00

    DatabaseBackup - SYSTEM_DATABASES - FULL: every weekday and Sunday at 23:50:00

    DatabaseBackup - USER_DATABASES - FULL: every weekday and Sunday at 00:30:00

    I came to this decision partly based on the information found on:

    http://serverfault.com/questions/235657/setting-up-ola-hallengrens-sql-jobs

    But what about the other jobs?

    Output File Cleanup

    sp_delete_backuphistory

    sp_purge_jobhistory

    :unsure: :unsure:

    Cheers,

    B

  • Surely you must have at least one database that is not Simple recovery model?

    So you need to schedule the log backup job as well. The schedule for this depends on your recovery point objective ie. how much data can you lose if everything goes pear-shaped and you have to restore from backups.

    Commonly this means anything from every 5 minutes to every 4 hours.

    If you don't backup the logs at all, they will grow and eventually fill the disk - unless you change the recovery model to Simple.

    As far as the cleanup jobs are concerned, they are very quick and can run at any time - I currently run them at 5am.

  • Hi Richard,

    thank you very much for your answer.

    I do indeed have a couple of DBs with FULL recovery model, so thanks for pointing out that important omission! :Whistling:

    I wasn't sure about the cleanup jobs, I will schedule them for a weekly run on early Monday.

    Kind regards,

    B

  • There we go - I have now setup the following schedule:

    Index optimize to take place every Sunday at 20:00:00

    DatabaseIntegrityCheck - SYSTEM_DATABASES: every weekday and Sunday at 22:00:00

    DatabaseIntegrityCheck - USER_DATABASES: every weekday excluding Sunday at 23:00:00

    DatabaseBackup - SYSTEM_DATABASES - FULL: every weekday and Sunday at 23:50:00

    DatabaseBackup - USER_DATABASES - FULL: every weekday and Sunday at 00:30:00

    DatabaseBackup - USER_DATABASES - LOG: Every 2 hours, every day, from 07:00:00 to 19:00:01

    Output File Cleanup: Every Sunday at 07:00:00

    sp_delete_backuphistory: Every Sunday at 07:00:00

    sp_purge_jobhistory: Every Sunday at 07:00:00

    Any feedback is welcome,

    B

  • First, you're doing a great job choosing a well known and respected set of scripts, and actually taking the time to plan things out.

    So, you have a maintenance window of A:00 to B:00 (in your case A: is 22:00)?

    A more general way of figuring out your maintenance/backup schedule:

    In summary:

    Know your requirements

    Know your system's capabilities

    Know your system's limitations

    Design your plan based on both of the above

    In detail:

    1) What is your Recovery Point Objective (RPO) on each database - can you afford to lose 10 minutes of data? An hour? Two days worth?

    1a) What is your archival requirement - i.e. on which, if any, databases do you need Point In Time Recovery (PITR)?

    i.e. "Show me what was in the database at 11:03 yesterday. Then show me what was in the database at 8:45 three days ago."

    2) What is your Recovery Time Objective (RTO) on each database - i.e. how long do you have from the time of failure to the time it's expected to be up again? Go as far down the list as is reasonable for _your_ business; but be clear to your management about which situations you are not planning to handle, or in fact cannot possibly handle without more resources.

    2a) Do you believe you can run a worst-case restore in this amount of time? A best-case restore?

    2b) If the system databases are down?

    2c) If the operating system has to be reinstalled?

    2d) After the hard drives crash and new hard drives are delivered? Where are you backups stored?

    2e) After the entire server caught fire? Where are your backups stored?

    2f) After three servers (including the backup server) caught fire, so your hardware folks have their attention split? Where else are your backups stored? How long does it take to get them?

    2g) After the building caught fire/stepped on by Godzilla/destroyed in hurricane/flooded/stolen by aliens/collapsed? Where else are your backups stored? How long does it take to get them?

    2h) After the entire region has had some disaster? Where else are your backups stored? How long does it take to get them when the roads are blocked and your regional power and communications grid is mostly not available?

    3) During your maintenance window, are you allowed to lock tables/indexes entirely?

    3a) If so, strongly consider setting Index maintenance to do offline index rebuilds, since the locking won't affect anything!

    3b) Are you allowed to take down the server? If so, this is also your window to apply Windows and SQL Server patches and security updates.

    Remember SQL Slammer? YES, you need to apply SQL Server service packs and security updates.

    4) Where are your backups being saved? Hopefully not to the same set of spindles the databases live on. If on a SAN, check with the SAN admin.

    5) How are you verifying CHECKDB results?

    6) Realize that backups can be done during the business day, FULL or otherwise! Full backups merely may slow things down a little. But, actually running real work also slows things down a little! Are you going to tell people not to use the server, because using it slows it down?

    7) Can new databases be added by anyone but you? Can anyone else change recovery models? Do you ever change a recovery model and forget to change your backup types?

    7a) FULL and BULK LOGGED recovery models REQUIRE transaction log backups.

    7b) DO NOT set a job to truncate the transaction log. That's what Simple recovery model is for.

    7c) Note that Ola's database selection SP can be easily modified to add "special" options, like ALL_DATABASES, for the various recovery models. I do this, so when databases appear or recovery models change, they're automatically given appropriate backups. Particularly appropriate to DB's for third party apps.

    8) How long does each maintenance take? How long do you expect it to take in the future, worst case, after 2 years of data growth?

    8_) First do index maintenance and defrag at the OS level; start clean to get your measures.

    8a) Full backups? And how big are they?

    8b) Index maintenance (of each type - reorg, online rebuild, offline rebuild - offline rebuilds can be faster, particularly on Enterprise/Datacenter)? If you have any index/tables that are much larger than everything else, how long do those take?

    8c) Differential backups? And how big are they?

    8d) Transaction log backups? And how big are they?

    8e) Checking for untrusted constraints?

    8f) CHECKDB PHYSICAL_ONLY?

    8g) CHECKDB with DATA_PURITY and EXTENDED_LOGICAL_CHECKS?

    9) Now, take all your answers from the above.

    9a) You _must_ have backup chains running at least as often as your RPO.

    9b) How much tolerance is there for a bad Full backup? With Full+T-Log backups, you can start with a previous Full and just restore more T-log backups to restore to the required point in time. Diff backups do not work unless the Full backup immediately prior works. Now, a bad T-log backup stops the chain right there. A bad Diff backup means you just use another Diff with the immediately prior Full.

    9c) Note that orgs with a very small RPO often run T-log backups every 5-15 minutes, all day, every day. Since a T-log backup is incremental, they occupy about the same about of space per day whether you run 4 a day or 400 a day.

    9d) Keep going through what you have, and how long it takes. Chart it on a calendar if you want!

  • Hi Nadrek,

    thank you so much for all that information.

    You would be quite disappointed as the databases I put this on are quite small (but it's a small organisation too, so that's ok :cool:)

    You would also hate the fact that the backups are placed on the same spindle as the actual databases -

    http://dbareactions.tumblr.com/post/42362387262/when-i-discover-that-the-database-backups-are

    ... this is to be changed at the earliest opportunity of course! ... to be fair, there is a parallel backup process that writes out the full backups to tape overnight so that's not too bad .. although the tapes are in same building (you can't have everything 🙂

    RTO and RPO are relatively 'low' (or high - whatever), the organisation is (for now) OK with loosing one day's worth of data, although that will have to be revisited in due course. In mean time, I am taking log backups every 2 hours.

    Thanks again for your post, very useful indeed,

    Kind regards,

    B

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply