High response times and queue lenthing duringlog back up

  • are the log backup jobs for all databases running concurrently or sequentially?

    has it always done this or has it just started happening? If recent, anything in the application/system log around the time it started?

    As previously asked, are you using a 3rd party compression tool? ie. redgate sqlbackup, hyperbac, quest, idera etc.

    _________________________________________________________________________________SQLGeordieWeb:- Jarrin ConsultancyBlog:- www.chrisjarrintaylor.co.ukTwitter:- @SQLGeordie

  • No we are use the standard SQL 2008 Log backups

    transation logs are large 1 is 50GB the rested are about 20GB

  • ...and how frequently are you running them?

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • Log backups run every 30 minutes

    Full backups daily

  • Wow - you have 50 GB of changes every 30 minutes on a given database and 20 GB on all others. That is a lot of data change in a short period of time and that is a lot of data to write to disk in a short period of time with a log backup. You could increase the frequency but that may not help things overall but will "spread" them out rather. Not sure if that would be a good thing though. I would be really interested to hear what you are doing that is generating that much change.

    Side note here but if you are dumping that much log backup to disk every 30 minutes you are going to have to increase your throughput at the disk level in order for that to perform quickly and not cause an impact to your database server.

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • gmcrouch (3/5/2012)


    No we are use the standard SQL 2008 Log backups

    transation logs are large 1 is 50GB the rested are about 20GB

    Are you using SQL 2008 compressed backups? Also, can you confirm whether the actual log data you're backing up is that size or that's the physical size of the log file on disk? What i'm getting at is that the log file (.ldf) can be 50gb on disk but may only contain 10gb of "logged" data within the 30mins. Sorry but trying to put it in layman's terms as you've stated you're not a DBA

    They could have been pre-sized at 50GB/20GB or possibly grown at some point to that size, if the latter is the case then 50GB/20GB of data change every 30mins is quite large and based on the info provided thats:

    DB528: 50GB

    DB529: 20GB

    DB530: 20GB

    DB531: 20GB

    db506: 20GB

    DB510: 20GB

    DB514: 20GB

    DB518: 20GB

    which by my calculation is 190GB of data backed up every 30mins...... :w00t:

    _________________________________________________________________________________SQLGeordieWeb:- Jarrin ConsultancyBlog:- www.chrisjarrintaylor.co.ukTwitter:- @SQLGeordie

  • Along with Chris's note, can you run the following;

    dbcc sqlperf (logspace) and provide the output.

    Thanks.

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • dbcc sqlperf (logspace)

    Database NameLog Size (MB)Log Space Used (%)Status

    master19.992192.0271590

    tempdb15359.990.075289440

    model19.9921912.758890

    msdb49.992195.0378970

    3305065075.11714.14790

    3905308806.182.4855810

    Distrib542.24222.0721970

    34051018755.685.6197460

    3505142587.74235.163210

    3605185075.11725.759780

    39052819999.990.20161140

    39052916268.34.8302080

    3905311916.05537.133160

    You guys think it hardware almost everythings been changed by other techs including lots of drives

    I'll start replacing erverything again Storage (HP MSA 50) Midplane and backplane replaced last night no change.

    Next up controller cache and storage IO mod. Drives with Highest responds time are on storage (Data Drives .MDF)

    Mother board has been replaced already can replace again.

    I defraged drives seemed to help first back up. not much to back up stoped SQL for defrag SQl only up for 5 min before back up

    but 2nd was back to Hi response long queue low IO.

    I think it looks like a fragmentation or contention issue because of High response LOW IO.

    Copy test from log drive to bak drive has High IO 150 MBs spikes over 200MB

  • Based on that it doesn't look like your log backups will be that big. Another question, what performance measures are you seeing for avg disk sec / read and avg disk sec / write on the drive that you are writing to? They would be under the physical disk counters in performance monitor.

    Thanks.

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • I just finished replacing Controller HP p800 flashed to 7.50 with new cache

    also MSA 50 IO mod has been changed

    during non back up times

    IO is between 50 and 200MB a second response time under 50MS Queue lenght under 5 ( Queue does spike to over 1000 sometimes for short period)

    1 minute after back up starts

    Response times and queue lenght jump and IO drops to about 1MB

    Response over 15000MS Queue Lenght over 1500

    get several event 833 errors in app log (IO request taking longer then 15 seconds)

    higest reponse time is on reads to .MDF files but is high accross all disk

    issue is still there.

    here is resouce manager during last log backup (top 10 by response time)

    ImagePIDFileRead (B/sec)Write (B/sec)Total (B/sec)I/O PriorityResponse Time (ms)

    sqlservr.exe984H:\MSSQL\DATA\360518_DataFile_Data.MDF5850585Normal35,692

    sqlservr.exe984H:\MSSQL\DATA\390530_DataFile_Data.MDF7450745Normal32,394

    sqlservr.exe984H:\MSSQL\DATA\350514_DataFile_Data.MDF5460546Normal32,132

    sqlservr.exe984H:\MSSQL\DATA\Distrib_Data.MDF2,73102,731Normal32,073

    sqlservr.exe984I:\MSSQL\DATA\390528_DataFile2_Data.MDF6830683Normal31,629

    sqlservr.exe984H:\MSSQL\DATA\340510_DataFile_Data.MDF9109101,820Normal29,363

    sqlservr.exe984H:\MSSQL\DATA\390529_DataFile_Data.MDF0576,483576,483Normal29,250

    sqlservr.exe984I:\MSSQL\DATA\360518_DataFile2_Data.MDF1580158Normal27,877

    sqlservr.exe984I:\MSSQL\DATA\390529_DataFile2_Data.MDF227584,526584,754Normal27,203

    sqlservr.exe984I:\MSSQL\DATA\330506_DataFile2_Data.MDF0137137Normal27,197

  • gmcrouch (3/7/2012)


    get several event 833 errors in app log (IO request taking longer then 15 seconds)

    That screams IO subsystem problem. 🙁

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • Brad McGehee posted a very informative set of slides on the transaction log, which were mentioned on this site yesterday and are accessible here http://bradmcgehee.com/wp-content/uploads/presentations/St%20Louis_Inside%20the%20SQL%20Server%20Transaction%20Log.pdf.

    Checking my assumptions at the door:

    How many databases has their log file on that drive?

    Only 1 log backup job is running at a time.

    I notice that in your LOGINFO post a number of the VLFs have the same CreateLSN

    2253952104857607482650648180000000457600592

    2253952107397127482660648180000000457600592

    2253952109936647482670648180000000457600592

    2286720112476167482680648180000000457600592

    I would suspect that your problem is related to the large number of quite small VLFs you have. My recommendation would be to

    1. Recreate your log files as per the recommendations on Kimberly Tripp's article. Fewer, larger VLFs should help

    2. Check the fragmentation on the log filesystem

    I suspect that you may have a lot of active VLFs. Check for any long running transactions too as these will cause VLFs not to get re-used appropriately.

  • I am going to try and truncate the transaction logs and resize them

    can someone suggest a size to start with for these DB's

    I was think 5 or 10MB

    need to get number of VLFs down 150+ is way too many

    Admin are ready to throw this server in the recycle bin.

  • 150 VLFs is not a huge amount. Still, if you want to get that number down, shrink the log to 0 then regrow to full size in a small number of operations (2 or 4).

    Full size, not 5 or 10 MB, and get the autogrow increments set to a sensible value for the log's size as well.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 14 posts - 16 through 28 (of 28 total)

You must be logged in to reply to this topic. Login to reply