Odd Scenario with MSSQL Backup...details inside

  • So I was presented with an odd outcome...one which is pretty unique.

    We have a DB which houses FileStream data (8 TB's)!

    It wasn't always this size, due to growth of the application it ballooned out of control to this size.

    It takes 8 days to do a native MSSQL backup of the database.   Recently, there was an issue and a request came down to restore the database to the last full backup to recover data out of the primary file group.

    So quick scenario, the backup started on 10-14 and finished on 10-22...

    We go into the table that the data is needed and the oldest record is not that much older than when the backup started, giving the impression that the 'database backup'  was done in short order and the remaining 8 days was the time it takes to back up all the FileStream data.

    There as activity on the table in question throughout the 8 day window, it is just not showing up in the restored backup.

    I was under the impression that the T-log would hold all the changes that occurred throughout the 8 days, and append them to the FULL....so that when the FULL was restored it would be consistent to the 22nd time frame.

    The fact that no data exists past the start time of the full backup gives me a different impression.

    It makes me feel like the full backup ran through the data file and backed up those DB pages and the little bit of the T-log to keep the database transnationally consistent....and then had to also include the FileStream stuff in the backup which takes the majority of the time.

    It seems a little deceiving based on the example I use from Paul Randal

    https://technet.microsoft.com/en-us/library/2009.07.sqlbackup(d=printer).aspx

    Otherwise I do not know how  to explain why there is only data from the 14th in the table (day / time the backup started) and none from that point forward even though the backup of the database ran for 8 days.

  • The Full backup would mark a checkpoint which is the start of the backup.  Any data that exists prior to that checkpoint would be in the full backup.  Any data that changed since the backup started would not be captured in the backup.

    It doesn't makes sense for the backup to contain data that happens after it starts.  This would result in transactional inconsistency.

    Lets say that user A starts the backup.  User B executes 'begin transaction".  Then the backup completes before User B commits the transaction.  Database dies, so you restore it.  It doesn't make sense to hold the "begin transaction" in that database.

    That being said though, the open transaction will not be stored even if User B ran their thing first, then user A started the backup.  The only thing that will be in the backup is completed transactions at the time the backup starts.

    The T-Log (depending on your backup settings) will hold the transactions that occur between tlog backups.  If you are in full recovery and don't do tlog backups, your logs will continue to grow forever.
    If you are in full recovery, make sure to do a full backup and tlog backups regularly.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • bmg002 - Monday, November 6, 2017 2:37 PM

    The Full backup would mark a checkpoint which is the start of the backup.  Any data that exists prior to that checkpoint would be in the full backup.  Any data that changed since the backup started would not be captured in the backup.

    It doesn't makes sense for the backup to contain data that happens after it starts.  This would result in transactional inconsistency.

    Lets say that user A starts the backup.  User B executes 'begin transaction".  Then the backup completes before User B commits the transaction.  Database dies, so you restore it.  It doesn't make sense to hold the "begin transaction" in that database.

    That being said though, the open transaction will not be stored even if User B ran their thing first, then user A started the backup.  The only thing that will be in the backup is completed transactions at the time the backup starts.

    The T-Log (depending on your backup settings) will hold the transactions that occur between tlog backups.  If you are in full recovery and don't do tlog backups, your logs will continue to grow forever.
    If you are in full recovery, make sure to do a full backup and tlog backups regularly.

    Pretty sure it doesn't work that way.
    If you just use some checkpoint mark (?) when the backup began, then what happens if you start a transaction before the backup and it commits while the backup is in process? That will end up with inconsistency - you just lost that transaction based on that logic. And that logic would mean that nothing could process while a backup is running - since you are beginning a transaction. It's all based on LSNs, not the start of the backup. And it actually can have transactions that began after the backup started. It uses LSNs for a lot of reasons including the reason I provided in the first example.
    The backup records the LSN when the backup starts and again when it ends. The backup threads copy the extents and if changes need to be done to extents during the backup, they can be done. The backup can contain more than just committed transactions at the time the backup started. It can contain transactions that began and committed during the backup. The backup does include relevant portions of the log as the poster indicated. This links explain it more as well as the link that the OP provided:
    Fuzzy backups and RESTORE DATABASE WITH STOPAT

    Sue

  • I've been trying to reply to this thread all day.

    I apologize for the misinformation.  I did more snooping and I am definitely incorrect in what I had posted.  Sue is correct here.
    The method I was thinking for backups would end up blocking all database changes during a backup and that does not occur.  I should have looked before I posted.

    Again, sorry about the bad information.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply