Help with SQL Server backups slow ( backup performance slower than before after data purge)

  • Hi everyone,

    (searched SQL Central... I didn't find anything specific)

    Wanted to get some ideas on solutions. I have a DB about 3TB in size which usually took about 55-65mins on average to back up on SAN.

    A data file was added because due to low on space. A 500GB was added with no auto-growth (managed actively) ... other data files existing in past are allocated as 500GB files.

    --- backup was normal 55-65 mins on average

    Afterwards sunday data purged 500GB

    -- backup ran slow after ..nearly about 2 hours

    Then purged another sunday 500GB

    ---same thing slow about about 2 hours

    Finally another sunday 200GB

    -- same thing slow about 2 hours

    The database has indexes rebuilt during the weekend by a custom database script created by the vendor. The files are allocated 500GB so I thought fragmentation would be more logical than physical. However with indexes rebuilt so there should be less logical fragmentation in DB.

    This DB is backed up to a SAN. Perfmon shows higher than normal I/O responses times...

    BUT ONLY for this 3TB DB trimmed to 1.8TB.

    Because there's a Database approximately 500GB on same SAN and LUNs/Mountpoints... with no degradation in Backup performance. (No purging for 500GB DB)

    ALSO I skimmed at the custom re-index by vendor, it seems to rebuild the index for the table that had the data purged from it. (FYI data purge was to specific DB table)

    Idea/comments suggestions :blush:

    thanks for everyone's time!

    --------------------------------------------------
    ...0.05 points per day since registration... slowly crawl up to 1 pt per day hopefully 😀

  • Does the purged table have a clustered index?

    Was the log backed up after the purge?

    How big is your log file and how much of it is in use?

    Is the new file in the same path as the other files?

    How are you backing up the database?

    - native

    - native with compression

    - 3rd party tool

  • Hi Thanks for reply!

    Does the purged table have a clustered index?

    Yes there is a clustered index, this is one in vendor custom job/stored proc that rebuilds the index using "DBCC REINDEX" (no fill factor is specified by vendor) - maybe because the data is always "inserted" (deletions Rarely happen) so they didn't include fill factor.

    Was the log backed up after the purge?

    Not right away, backup occurs later in the day, and backs up the log.

    Purges occur in the Sunday early mornings. (it's new it seems this will happening from now on... after the first few Large purges it should be much smaller)

    Based on your comments and what is happening to the DB there would be some log growth, but doesn't seem to impact backups so much maybe?? (see comments later about how there was little impact on duration of backup)

    I looked at the Job that's been created for DB Backup

    1. Shrinks the log down to 10GB

    2. Backup Prod

    3. Backup Log with INIT (as a backup device... another job hourly backups of to this initialized device)

    How big is your log file and how much of it is in use?

    The log file is sized 10GB daily to accommodate requirements restores to reporting system

    The "Shrink" log is 10GB, and expands in 1GB increments, this would impact purge of the data time and re-indexing would cause log file growth, but since that's a separate job. "Backup job" taking long still is unclear. Also I queried the MSDB on backup times for the log backups top 3 backup times for logs were 23 minutes, 11 minutes, 6 minutes long, so it seems backing up of the logs as part of "DB Backup job" is not high impact to duration.

    I simplified my original post, I should not have here are actual backup times and sizes.

    -Purging occurs in the early morning on Sundays

    -Index Maint only occurs on Sundays

    -Backups occur later after Index Maint.

    Backup Size time(mins) Day of week.

    --------------------------------------------

    2,657,367,657,47265 Tuesday

    2,665,147,928,57660 ....

    2,674,100,114,43260 ....

    2,681,622,008,83269 ....

    2,689,124,472,83255 ....

    2,270,140,401,664132 Sunday Purge

    2,096,109,193,216167 Monday Purge

    2,107,319,995,392103 ....

    2,114,000,817,152109 ....

    2,121,961,771,008109 ....

    2,128,885,945,344105 ....

    2,135,120,547,840104 ....

    1,710,387,889,152158 Sunday Purge

    1,668,634,818,560114 Monday no purge

    1,670,652,529,664106 ....

    1,656,420,707,328103 ....

    1,664,357,968,896109 ....

    1,671,628,600,320129 Sunday No Purge

    1,678,234,466,304105 ....

    1,319,545,214,976148 Sunday Purge

    1,321,366,946,816132 Monday no purge

    Is the new file in the same path as the other files?

    Yes it's on the same LUN as another Datafile.

    Configuration of datafiles are:

    Datafile 1 & 2 are on same LUN

    Datafile 3 & 4 are on same LUN

    Datafile 5 & 6 are on same LUN

    How are you backing up the database?

    - native

    - native with compression

    - 3rd party tool

    Using SQL Native with Compression.

    Thanks for any other insight and thoughts!

    --------------------------------------------------
    ...0.05 points per day since registration... slowly crawl up to 1 pt per day hopefully 😀

  • 1. If the transaction log is not backed up after the index rebuild or purge and before the the log shrink,

    the shrink will likely fail, because there will be more than 10gb of data in the log.

    2. If the transaction log is not backed up before the full backup, you are backing up the contents of the log

    generated by the index rebuild and purge.

    3. I the time you are looking at is the combined full backup and log backup and the log backup was 23 minutes

    after the purge, it makes sense that the total time would be at least:

    normal duration + 23 minutes log backup + ~23 minutes backing up the contents of the log in the full backup.

    Are you not backing up the log regularly (i.e. hourly or every 15 minutes?)

  • 1. If the transaction log is not backed up after the index rebuild or purge and before the the log shrink,

    the shrink will likely fail, because there will be more than 10gb of data in the log.

    Hourly Transaction logs are ocurring throughout the day so this I believe clears the logs prior to the Backup job (or at least more of it )

    (1) Shrinks log to 10 GB (2) Backup DB (3) Re-Init Log backup Device(file) .... hourly logs continue afterwards

    a different job manages that process.

    2. If the transaction log is not backed up before the full backup, you are backing up the contents of the log

    generated by the index rebuild and purge.

    You're right I think that's why there was a huge 23 mins worth of LOG that needed to be backed up. But that only happened once the 23 minutes, most of the time the logs backup in a few minutes on an hourly basis.

    below are backups by hour (including re-initialized ones) sorted by size of log and minutes it took to backup... this was sorted by all log backups from MSDB ordered by longest time.

    SizeGB Minutes

    ------------------

    290 23

    149 11

    99 6

    22 2

    12 1

    10 1

    8 1

    8 0

    8 0

    7 0

    6 0

    6 0

    6 0

    6 0

    ........

    .....

    ....

    ....

    3. I the time you are looking at is the combined full backup and log backup and the log backup was 23 minutes

    after the purge, it makes sense that the total time would be at least:

    normal duration + 23 minutes log backup + ~23 minutes backing up the contents of the log in the full backup.

    Unfortunately adding 23 minutes + (average "Normal" backup time of 60mins) doesn't come close to 100+ mins so I'm stumped.

    Are you not backing up the log regularly (i.e. hourly or every 15 minutes?)

    Yes every hour so odd I thought... I'm fairly new to the system so maybe there's something I'm missing.

    --------------------------------------------------
    ...0.05 points per day since registration... slowly crawl up to 1 pt per day hopefully 😀

  • The full backup isn't backing up the log, unless the index rebuild is running while the full backup is running. Then it will catch items changed during the data reading portion of the backup.

    After the purges, is the size of the full backup unusually large? Or does it not take into account the 500GB purge?

  • When you delete a large or very large numberr of rows at one time, SQL Server defers the actual deletes. The next time a page with deleted r0w(s) is modified (or read?: I can't remember if you have to just read the page or do a mod to it), the previously logically deleted ("ghost" rows) are physically deleted. What you are seeing might be related to that aspect of how SQL handles large deletes.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Thanks Steve for reply!

    The full backup isn't backing up the log, unless the index rebuild is running while the full backup is running. Then it will catch items changed during the data reading portion of the backup.

    The index job runs sometime in the early morning (Purge job even earlier), jobs are ran sequentially/dependent on completion of each other. The backup job setup happens towards the later part of the day.

    During the regular day hourly backups are occurring, so log isn't filled. Checking MSDB I can see the log being backed up.

    After the purges, is the size of the full backup unusually large? Or does it not take into account the 500GB purge?

    Good question on size differences seems sizes have been decreasing. After purges the size of the backup has gotten progressively smaller, it's now about 40% smaller in size than when this all started. Just longer to backup.

    --------------------------------------------------
    ...0.05 points per day since registration... slowly crawl up to 1 pt per day hopefully 😀

  • Thanks for mentioning

    When you delete a large or very large numberr of rows at one time, SQL Server defers the actual deletes. The next time a page with deleted r0w(s) is modified (or read?: I can't remember if you have to just read the page or do a mod to it), the previously logically deleted ("ghost" rows) are physically deleted. What you are seeing might be related to that aspect of how SQL handles large deletes.

    Interesting, I didn't know that about how SQL Server handles large deletes. I'll have to research and read more about this.

    In meantime, I'm getting some baselines on a test server of the Database... try some experiments.:ermm:

    --------------------------------------------------
    ...0.05 points per day since registration... slowly crawl up to 1 pt per day hopefully 😀

  • You can also test the if the bottleneck is coming from the backup disk or your data disk.

    Backup to nul to test backup read speed, if it doesn't change (or is even quicker) then you know you have an issue with the backup disk you are writing to.

    It's a simple test to close some doors.

    BACKUP DATABASE [MyDB] TO DISK = 'nul' WITH COPY_ONLY

    Copy_only to not break your log backup chain.

    This is usually quicker than a regular backup since the bottleneck is often the write speed.

    You can also play around with the number of stripe, BUFFERCOUNT and MAXTRANSFERSIZE (Books Online).

  • What is the auto growth of your lDF file?

    I am about to blame the number of VLF files in your database. Your Purge might be adding lots of small sized vlf files which in turn , may cause the backup process run slow. Backup process obviously "reads" ldf file, large numbr of small size vlf files will delay.

    http://www.sqlskills.com/blogs/kimberly/transaction-log-vlfs-too-many-or-too-few/

    http://www.sqlskills.com/blogs/kimberly/8-steps-to-better-transaction-log-throughput/

    Next time you do purge, do Pre size your ldf file accordingly.

  • You can also test the if the bottleneck is coming from the backup disk or your data disk.

    Backup to nul to test backup read speed, if it doesn't change (or is even quicker) then you know you have an issue with the backup disk you are writing to.

    It's a simple test to close some doors.

    BACKUP DATABASE [MyDB] TO DISK = 'nul' WITH COPY_ONLY

    Copy_only to not break your log backup chain.

    This is usually quicker than a regular backup since the bottleneck is often the write speed.

    You can also play around with the number of stripe, BUFFERCOUNT and MAXTRANSFERSIZE (Books Online).

    Thanks for suggestion, that's a good one, I'd forgotten about backing up to 'nul' - will try that out this week!

    --------------------------------------------------
    ...0.05 points per day since registration... slowly crawl up to 1 pt per day hopefully 😀

  • SQL Show (5/8/2013)


    What is the auto growth of your lDF file

    I am about to blame the number of VLF files in your database. Your Purge might be adding lots of small sized vlf files which in turn , may cause the backup process run slow. Backup process obviously "reads" ldf file, large numbr of small size vlf files will delay.

    http://www.sqlskills.com/blogs/kimberly/transaction-log-vlfs-too-many-or-too-few/

    http://www.sqlskills.com/blogs/kimberly/8-steps-to-better-transaction-log-throughput/

    Next time you do purge, do Pre size your ldf file accordingly.

    Autogrow set to 1GB increments. (That will get addressed in future)

    Interesting to point out number of VLFs could be the issue, but prior to all the multiple purges the number of VLFs was around 200s though (backup times back then were under 65mins),

    After purge it's around 200s also, so not much change, in number of VLFs, I completely ruled it out....But now I'd like to give that a try just to see, because any ideas are worth a shot. Thanks for suggestion, worth a try - wouldn't have considered without your suggestion

    --------------------------------------------------
    ...0.05 points per day since registration... slowly crawl up to 1 pt per day hopefully 😀

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply