Testing Database Restores

  • I'm looking for some feedback from the community. I finally got permission to set up a server to do test restores. Yes, I know, you don't have backups until you know you can restore from them. Things move slowly at some places. My questions are the following:

    1. Do you test each full backup every day or do you sample every other day or once a week? With one test server, I don't think we could make it through testing each backup for all our DBs in 24 hours.
    2. Do you also do DBCC CHECKDB on the restored backups or just run that on the server where the backups were taken?
    3. Our largest DB is about 800GB. I know we're going to have to experiment to see how much memory to put on this server but does anyone have a suggestion for picking a starting point?
    4. Networking wants to install the test SQL Server on the fileshare server that hosts all of our backups. I get it from the standpoint that you're not moving all those bits over the network to restore them, but I think this might negatively affect the backup server's I/O while our servers are trying to write full and log backups to it and then do restores and possibly DBCC CHECKDB.

    I'd love to hear your thoughts on this and anything else you might like to add.

    Thanks All!

    -Tom

  • Tom Uellner wrote:

    I'm looking for some feedback from the community. I finally got permission to set up a server to do test restores. Yes, I know, you don't have backups until you know you can restore from them. Things move slowly at some places. My questions are the following:

    1. Do you test each full backup every day or do you sample every other day or once a week? With one test server, I don't think we could make it through testing each backup for all our DBs in 24 hours.
    2. Do you also do DBCC CHECKDB on the restored backups or just run that on the server where the backups were taken?
    3. Our largest DB is about 800GB. I know we're going to have to experiment to see how much memory to put on this server but does anyone have a suggestion for picking a starting point?
    4. Networking wants to install the test SQL Server on the fileshare server that hosts all of our backups. I get it from the standpoint that you're not moving all those bits over the network to restore them, but I think this might negatively affect the backup server's I/O while our servers are trying to write full and log backups to it and then do restores and possibly DBCC CHECKDB.

    I'd love to hear your thoughts on this and anything else you might like to add.

    Thanks All!

    -Tom

    I'm on the same journey - we (meaning me) do restores from the latest backup once per month, split up throughout the month.

    We don't do DBCC CHECKDB on restore (to keep the timings down), but I know that we should.

    I restore quite a few databases that are around 600GB (maybe 12 of them), and the little box I have has only 8GB of RAM. The key was getting SSDs in place - memory is only going to be a factor if you have memory-optimised tables.

    It's possibly cheaper for your network team to buy you a workstation or small server with a few SSDs rather than consume valuable space on the fileserver... but that's only my opinion.

     

    MVDBA

  • Do you test each full backup every day or do you sample every other day or once a week? With one test server, I don't think we could make it through testing each backup for all our DBs in 24 hours.

    I would test restore all your production environment databases as often as you can.  I have one database I restore every day, and all the rest once a week.
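
    For concreteness, a minimal sketch of one such test restore (the database name, share path, and drive letters are all placeholders to adapt):

    ```sql
    -- Quick media check first: confirms the backup is complete and readable,
    -- without restoring any data.
    RESTORE VERIFYONLY
    FROM DISK = N'\\BackupShare\SQL\SalesDB_Full.bak';

    -- The real test: restore under a different name on the test server.
    -- Logical file names ('SalesDB', 'SalesDB_log') come from:
    --   RESTORE FILELISTONLY FROM DISK = N'\\BackupShare\SQL\SalesDB_Full.bak';
    RESTORE DATABASE SalesDB_RestoreTest
    FROM DISK = N'\\BackupShare\SQL\SalesDB_Full.bak'
    WITH MOVE N'SalesDB'     TO N'D:\Data\SalesDB_RestoreTest.mdf',
         MOVE N'SalesDB_log' TO N'L:\Log\SalesDB_RestoreTest.ldf',
         STATS = 10;  -- progress message every 10%
    ```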

    Do you also do DBCC CHECKDB on the restored backups or just run that on the server where the backups were taken?

    We do DBCC CHECKDB on the source databases once a week in low workload hours.
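
    The source-side check is just the plain call, scheduled in an Agent job for the quiet window (database name is a placeholder):

    ```sql
    -- Full weekly integrity check on the source database.
    -- NO_INFOMSGS limits the output to actual problems.
    DBCC CHECKDB (N'SalesDB') WITH NO_INFOMSGS, ALL_ERRORMSGS;
    ```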

    Our largest DB is about 800GB. I know we're going to have to experiment to see how much memory to put on this server but does anyone have a suggestion for picking a starting point?

    As Mike mentioned, memory on this server isn't as important as disks, unless the server you are restoring to is also a development or QA environment server.

    Networking wants to install the test SQL Server on the fileshare server that hosts all of our backups. I get it from the standpoint that you're not moving all those bits over the network to restore them, but I think this might negatively affect the backup server's I/O while our servers are trying to write full and log backups to it and then do restores and possibly DBCC CHECKDB.

    Again I think this is going to depend on the purpose of these restored databases and the purpose of the file server.  Is this file server used for anything besides the SQL backups?  Will the restored databases be used for a development or QA environment?  If either of those is yes then putting this instance on the file server sounds like a bad idea.

     

  • MVDBA (Mike Vessey) wrote:

    I'm on the same journey - we (meaning me) do restores from the latest backup once per month, split up throughout the month.

    We don't do DBCC CHECKDB on restore (to keep the timings down), but I know that we should.

    I restore quite a few databases that are around 600GB (maybe 12 of them), and the little box I have has only 8GB of RAM. The key was getting SSDs in place - memory is only going to be a factor if you have memory-optimised tables.

    It's possibly cheaper for your network team to buy you a workstation or small server with a few SSDs rather than consume valuable space on the fileserver... but that's only my opinion.

    Thanks Mike. This is very helpful. We don't have any memory-optimized tables, so I won't worry about memory right out of the gate. They can always add more memory if we need it.

    I like the idea of a small server with some SSDs, but I know they aren't going to go that path at the moment. They have decommissioned several older servers and want to give me space out of that pool for data, log and tempdb. I'm thinking as long as I have enough disk space for the largest DB that we have, plus maybe 20%-40% for growth, we should be OK to start.
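
    For sizing that pool, a quick query against the source instance gives the per-database on-disk footprint (sys.master_files reports size in 8 KB pages; only standard catalog views are used here):

    ```sql
    -- Per-database file footprint in GB, largest first, to size the
    -- test server's drives at "largest DB + 20-40% headroom".
    SELECT  d.name AS database_name,
            CAST(SUM(mf.size) * 8.0 / 1024 / 1024 AS DECIMAL(10, 1)) AS size_gb
    FROM    sys.master_files AS mf
    JOIN    sys.databases AS d
            ON d.database_id = mf.database_id
    GROUP BY d.name
    ORDER BY size_gb DESC;
    ```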

  • Chris Harshman wrote:

    Do you test each full backup every day or do you sample every other day or once a week? With one test server, I don't think we could make it through testing each backup for all our DBs in 24 hours.

    I would test restore all your production environment databases as often as you can.  I have one database I restore every day, and all the rest once a week.

    Do you also do DBCC CHECKDB on the restored backups or just run that on the server where the backups were taken?

    We do DBCC CHECKDB on the source databases once a week in low workload hours.

    Our largest DB is about 800GB. I know we're going to have to experiment to see how much memory to put on this server but does anyone have a suggestion for picking a starting point?

    As Mike mentioned, memory on this server isn't as important as disks, unless the server you are restoring to is also a development or QA environment server.

    Networking wants to install the test SQL Server on the fileshare server that hosts all of our backups. I get it from the standpoint that you're not moving all those bits over the network to restore them, but I think this might negatively affect the backup server's I/O while our servers are trying to write full and log backups to it and then do restores and possibly DBCC CHECKDB.

    Again I think this is going to depend on the purpose of these restored databases and the purpose of the file server.  Is this file server used for anything besides the SQL backups?  Will the restored databases be used for a development or QA environment?  If either of those is yes then putting this instance on the file server sounds like a bad idea.

    Thanks Chris. I appreciate the help.

    The file share is only for SQL Server backup files. I get why networking wants to put SQL Server on this VM, since the bits won't have to go over the wire to another server to be restored. It will only be used for validating backups. No DEV or QA use. Like you, we have one database whose backups I'd really like to test daily if we have enough time, and then slot in the others round robin as we can.

    Chris / Mike - Are you using home-grown tools for doing the test restores, or are you using something else? I'm going to take a look at dbatools because I'm pretty sure there will be some goodies in there to help automate this. Just curious what you both are using.
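
    (For the home-grown route, the usual starting point is msdb's backup history on each source server - a sketch of finding the newest full backup file per database is below. From what I've seen, dbatools' Test-DbaLastBackup automates this same restore-check-drop loop.)

    ```sql
    -- Newest full backup per database, with the file it was written to.
    -- Run on the source server; msdb only has history for backups taken there.
    WITH ranked AS
    (
        SELECT  bs.database_name,
                bs.backup_finish_date,
                bmf.physical_device_name,
                ROW_NUMBER() OVER (PARTITION BY bs.database_name
                                   ORDER BY bs.backup_finish_date DESC) AS rn
        FROM    msdb.dbo.backupset AS bs
        JOIN    msdb.dbo.backupmediafamily AS bmf
                ON bmf.media_set_id = bs.media_set_id
        WHERE   bs.type = 'D'  -- 'D' = full database backup
    )
    SELECT  database_name, backup_finish_date, physical_device_name
    FROM    ranked
    WHERE   rn = 1;
    ```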

  • Tom Uellner wrote:

    MVDBA (Mike Vessey) wrote:

    I'm on the same journey - we (meaning me) do restores from the latest backup once per month, split up throughout the month.

    We don't do DBCC CHECKDB on restore (to keep the timings down), but I know that we should.

    I restore quite a few databases that are around 600GB (maybe 12 of them), and the little box I have has only 8GB of RAM. The key was getting SSDs in place - memory is only going to be a factor if you have memory-optimised tables.

    It's possibly cheaper for your network team to buy you a workstation or small server with a few SSDs rather than consume valuable space on the fileserver... but that's only my opinion.

    Thanks Mike. This is very helpful. We don't have any memory-optimized tables, so I won't worry about memory right out of the gate. They can always add more memory if we need it.

    I like the idea of a small server with some SSDs, but I know they aren't going to go that path at the moment. They have decommissioned several older servers and want to give me space out of that pool for data, log and tempdb. I'm thinking as long as I have enough disk space for the largest DB that we have, plus maybe 20%-40% for growth, we should be OK to start.

    Beware growth on restore - if the backup was taken while the log file was big, you might run out of space on restore, because the restore recreates the files at their recorded sizes even when they're mostly empty.

    Normally we don't advocate shrinking the log file, but when restore-testing on a limited-resource server, a shrink might be needed before the backup is taken.
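
    Something like this on the source, before the backup that will feed the restore test (the logical log file name is a placeholder - check sys.database_files for the real one):

    ```sql
    USE SalesDB;
    GO
    -- Shrink the log to roughly 1 GB so the restored copy doesn't
    -- recreate a huge, mostly-empty log file on the small test server.
    DBCC SHRINKFILE (N'SalesDB_log', 1024);  -- target size in MB
    ```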

    MVDBA

  • We have automated weekly restores (including CHECKDB).

    The target server has a C: drive and one big RAID-0 volume, since test restores are its only purpose.

    It holds at most 2 user databases on that big drive (to keep size under control).

    Databases are removed immediately after their restore/CHECKDB operation.

    Filesystem bandwidth and speed are what matter most.

    I would indeed advocate shipping it to its own box and disks since, given the sizes involved, it may noticeably interfere with - or suffer from - the file server's shared I/O.
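
    One cycle of that pattern, as a rough sketch (all names are placeholders):

    ```sql
    -- Restore, check, drop: each database only occupies the RAID-0
    -- volume for the duration of its own test.
    RESTORE DATABASE SalesDB_Check
    FROM DISK = N'\\BackupShare\SQL\SalesDB_Full.bak'
    WITH MOVE N'SalesDB'     TO N'R:\Restore\SalesDB_Check.mdf',
         MOVE N'SalesDB_log' TO N'R:\Restore\SalesDB_Check.ldf';

    DBCC CHECKDB (N'SalesDB_Check') WITH NO_INFOMSGS;

    DROP DATABASE SalesDB_Check;
    ```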

     

    Johan

    Learn to play, play to learn!

    Don't drive faster than your guardian angel can fly...
    but keeping both feet on the ground won't get you anywhere.


  • One thing I would add: I was really tempted to drop databases after the restore, but there have been a number of times a developer needed to get some data back because they deleted it, and I have the magic wand of "oh, here it is - just look at the DR server."

    I know the purpose of a DR test is to make sure we can recover, but who says we can't use it afterwards?

    MVDBA

  • 1. Everything if you can, but certainly many can't, for valid reasons. So, triage. "We can't live without it" databases get regular and constant love through testing. Everything else gets scheduled for occasional checks.
    2. Yes. In fact, for every database where you can do regular DBCC checks against the restored backup, you only have to do physical-only DBCC checks on the production side, radically reducing load (logical checks are the more process-intensive ones) - see the sketch after this list.
    3. Testing is your buddy. I've got nothing.
    4. Nope - with the full knowledge that people are frequently on a single SAN, and that what looks like separation of storage often isn't. Separate the storage of your tests from your backups. Why on earth would you intentionally interfere with the safety net?
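
    A sketch of the split from point 2 (database names are placeholders):

    ```sql
    -- Production side: physical structure only - much cheaper than a full check.
    DBCC CHECKDB (N'SalesDB') WITH PHYSICAL_ONLY, NO_INFOMSGS;

    -- Restore-test side: the full (physical + logical) check runs here instead.
    DBCC CHECKDB (N'SalesDB_RestoreTest') WITH NO_INFOMSGS;
    ```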

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • MVDBA (Mike Vessey) wrote:

    One thing I would add: I was really tempted to drop databases after the restore, but there have been a number of times a developer needed to get some data back because they deleted it, and I have the magic wand of "oh, here it is - just look at the DR server."

    I know the purpose of a DR test is to make sure we can recover, but who says we can't use it afterwards?

    If you can keep them around, they do offer other benefits. However, then you need to be even more sure of security, since you're creating an additional vector, and one outside of what might be a more restricted production environment. Absolutely not knocking what you're doing - I used to do the same - just pointing out it has implications as well as benefits.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey wrote:

    MVDBA (Mike Vessey) wrote:

    One thing I would add: I was really tempted to drop databases after the restore, but there have been a number of times a developer needed to get some data back because they deleted it, and I have the magic wand of "oh, here it is - just look at the DR server."

    I know the purpose of a DR test is to make sure we can recover, but who says we can't use it afterwards?

    If you can keep them around, they do offer other benefits. However, then you need to be even more sure of security, since you're creating an additional vector, and one outside of what might be a more restricted production environment. Absolutely not knocking what you're doing - I used to do the same - just pointing out it has implications as well as benefits.

    I get what you are saying, but this little baby has no internet access, and the only attack vector is an internal idiot... Maybe there is a case for not leaving them around - force the dev team to get me to restore it for them? As a process we'd have much more control, but it takes maybe 2 hours longer to get a fix in place... it's not an easy choice.

     

    MVDBA

  • Grant Fritchey wrote:

    4. Nope - with the full knowledge that people are frequently on a single SAN, and that what looks like separation of storage often isn't. Separate the storage of your tests from your backups. Why on earth would you intentionally interfere with the safety net?

    Thanks Grant. This is what was in the back of my mind too. Networking wants to save space and bandwidth but I don't think they're thinking all the way through the issue.

    And thanks for the idea of doing the full CHECKDB on the restores and switching production to PHYSICAL_ONLY. Maintenance is starting to take a long time on some of these servers, and that would be a way to save ourselves some time.

  • Thanks everyone for all the ideas. I appreciate your thoughts on this.
