Possible oddity in msdb.dbo.backupset

  • Hi all

     

    I've written a script to keep track of how long backups are taking, which databases got a full or differential backup, and a few other bits.

    I'm using msdb.dbo.backupset as my base table but I think I've found something strange.

    According to that table, we managed to backup a 300GB database in 8 seconds (see picture).

    This could cause us issues when tracking how long backups are taking.

     

    I've narrowed it down to two possibilities:-

    • I'm an idiot and looking at it wrong/using the wrong table
    • msdb.dbo.backupset is reporting the times incorrectly

     

    As a bit of background, we use an SSIS package to control the backup routine and we backup 4 databases at a time.

    We do one full backup every 7 weeks (due to space constraints) and we keep the last 5 differentials (again, due to space constraints).

    All databases are in SIMPLE mode as they are designed to catch up from where they left off if we have to restore a backup.

     

    Let me know if you need any more info.

     

    TIA

     

    Richard

    • This topic was modified 1 year ago by richardmgreen1. Reason: Added better screenshot
    Attachments:
    You must be logged in to view attached files.
  • My first thought would be to check the is_Snapshot column. It could be the case that the backup is being taken by a snapshot.

  • @Coffee_&_SQL - I had thought of that but we've got as table that contains the databases that need to be backed up (with dates for the next full backup) and none of the snapshots are in there.

    Don't get me wrong, if we could do a backup of a 300GB database in 8 seconds (or backup a 400GB database in 9 seconds), I'd be announcing it to the world but I think msdb.dbo.backupset is lying.

  • Are you using a 3rd party backup application that's called from your SSIS package?  Potentially, maybe if that's the case, the backup application is taking a VSS snapshot (which is very quick,) and then backing up from that (which on the back end would take longer.)

    This wouldn't show up as a "snapshot backup" because as far as SQL is concerned, it's not.

  • @jasona.work - no, it uses dynamic SQL to do the backup (full or differential). We do have a server-level backup done by a 3rd-party but that's only the complete server (it's a VM).

    • Is it only the one database running into this, or are there others (or all of them?)
    • Do your differential backups show a similar behavior?
    • Any other SQL Servers (ideally, separate VMs, even) doing this?
    • Just to verify, as well, your backup inside the SSIS package then is nothing more than "BACKUP DATABASE [FOO] TO..."
  • This table is written to by various backup processes, where they log what they do. As noted, potentially you have something doing some sort of snapshot that is recorded as a backup, which might take seconds because of hardware tricks or some sort of snapshot.

    I wouldn't be too worried about this, as it likely isn't a problem. I might ignore these, and not use them in RPO calcs unless I know what this actually is.

  • richardmgreen1 wrote:

    @jasona.work - no, it uses dynamic SQL to do the backup (full or differential). We do have a server-level backup done by a 3rd-party but that's only the complete server (it's a VM).

    Steve's comment got me thinking and I have a hunch I know where you 8 second backup is coming from AND it could potentially be a problem.

    It *MAY* be your VM backup software backing up the VM.  Easy to check if this is on the right track, compare the backup times with the fast backups to when the VM is being backed up.  Where, if this is right, that could be a problem is, SQL will see those backups as a valid full database backup and base any differential backups off that, which makes restoring the database, shall we say, problematic.

    I had a similar issue not long after we migrated our databases to VMs in Azure being backed up by Azure backup for the VMs, it took some hunting to sort out what we had to do to STOP the databases from being included and let our Agent job backups be in control.  As I recall, it's a registry key (again, this is specific to an Azure environment, your VM backup software likely has something similar)

  • In some semblance of order:-

     

    @jasona.work -

    • Is it only the one database running into this, or are there others (or all of them?) - all full backups are showing the same behaviour unless they've had a backup done in the last week or so
    • Do your differential backups show a similar behaviour? - No.  They appear to be reporting the correct start/finish times
    • Any other SQL Servers (ideally, separate VMs, even) doing this? - I've checked another server that has the same backup routine in place and that appears to be exhibiting some of the same behaviour but it's not for all databases.  I'll check which databases are showing what appears to be the correct behaviour to see when they were last scheduled for a full backup
    • Just to verify, as well, your backup inside the SSIS package then is nothing more than "BACKUP DATABASE [FOO] TO..." - The SSIS package just calls a stored procedure for the backup (I can send the code for checking/critiquing if you want to take a look) but it essentially does just run the BACKUP DATABASE command

     

    @steve-2 Jones - thanks for that.  It's not used for RPO/RTO timings yet but may be used in the future so I'd like to get the most accurate information possible.

     

    @jasona.work - We do the database backups (fulls) on a Sunday and the diffs are done daily.  The server-level backup done by the 3rd-party application is started on a Friday (I've had this confirmed by our infrastructure team)

     

    ETA - According to the MSDB table, all the backups were started on the same day (for the fulls) which I know isn't correct.  It does show that as a Friday so it's possible that it's the server-level backup but that backup takes several days (even snapshotting/cloning it for backup takes more than a few seconds).

    Assuming it is the server-level backup that's causing the issue, the question is why would backing up an entire VM affect the backup table (this backup doesn't run any SQL at all).

  • So what I've seen with a couple different backup applications is, the backup application takes a VSS snapshot of the drive(s,) and SQL "sees" that as a full backup and records it in the table you're querying.

    Here's where it gets ugly.  SQL sees it as a Full database backup and any following differentials use that point as their base.

    I would strongly, strongly recommend, if at all possible, you do a test restore of your database somewhere, using a Full backup from Sunday and a Differential from Saturday AFTER the VM backup has run.  My suspicion is, the restore will fail.

    If that happens, then the next step would be to check the VM backup software to see if it's possible to exclude SQL databases from the backup, or if there's a registry key to set to accomplish that, or possibly to modify your database backup schedule so the full backups run very shortly after the VM backup runs (and even then, there's going to be a brief window of time where you could still lose data)

  • @jasona.work - Thanks for that, at least it means I'm not going insane.  I'll speak to our infrastructure team to see exactly what the backup application does in terms of VS snapshots.  Hopefully it is just that, but, from you've put, that could also cause some issues.

    Looks like I'm shelving my current project and doing some restore testing to see if we will be having issues.

    If it is the backup solution, it also looks like I'll have to adjust our backup schedule to do a full backup weekly (that will cause it's own issues).

     

    Oh well, possibly back to the drawing board.......

     

    ETA - I'm trying to restore one our smaller databases that has the following backups:-

    Full taken on 29/09/2024

    Last differential backup taken 22/10/2024

     

    If this works, then all is good except for the times.

    If it doesn't, we've got problems.

  • Keep in mind, I'd expect your VM backup application to have some method to "turn off" backing up the SQL databases, or at least causing it to take a normal Full backup of the databases.  Possibly an option to make any database backups be a copy only backup.

    Definitely work with the infrastructure team to look into it.  Possibly the worst-case solution would be to configure the VM backup to exclude the volume(s) your databases are on (if possible.)

  • I used the database I chose for a couple of reasons:-

    1. it's small so doesn't take long to restore
    2. There's been a least one server-level backup between the full and the differential I'm wanting to restore

    The good news is I've just restored one of smaller databases successfully using the GUI (I just restored it under a slightly different name to stop it overwriting the current database.

    I've checked a few tables and all the data appears to be there and accessible.

    From that (although it is only a single test), it looks like we can successfully restore databases.

    It now looks like it's just msdb.dbo.backupset that's telling lies.

     

    According to the backupset table, the last full backup was taken 2024-10-11 11:34:20.000 and finished at 2024-10-11 11:34:28.000

    According to my logging, the backup was between 2024-10-22 20:53:16.280 (start) and 2024-10-22 20:56:38.690 (finish).

    Looks like I can't trust the backupset table for timings and I'll have to use my own logging.

     

    Thanks for all the help folks.

  • I'd still say you can trust the backupset table, because I'd still bet money (only a quarter, granted, but money) that the VM backup is causing that 8s backup entry.  The good news, though, is that you CAN restore despite that, which also suggests to me that your infrastructure team / the VM backup software *IS* configured correctly to not screw up SQL backup chains.

    Possibly what you could do for your reporting script is, configure it to exclude / ignore backups that would be about the time your VM backups are taken.

Viewing 14 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply