High response times and queue lenthing duringlog back up

Question

High response times and queue lenthing duringlog back up

Viewing 14 posts - 16 through 28 (of 28 total)

You must be logged in to reply to this topic. Login to reply

ChrisTaylor SSCarpal Tunnel Points: 4527 More actions · Answer 1

are the log backup jobs for all databases running concurrently or sequentially?

has it always done this or has it just started happening? If recent, anything in the application/system log around the time it started?

As previously asked, are you using a 3rd party compression tool? ie. redgate sqlbackup, hyperbac, quest, idera etc.

_________________________________________________________________________________SQLGeordieWeb:- Jarrin ConsultancyBlog:- www.chrisjarrintaylor.co.ukTwitter:- @SQLGeordie

gmcrouch SSC-Addicted Points: 442 More actions · Answer 2

No we are use the standard SQL 2008 Log backups

transation logs are large 1 is 50GB the rested are about 20GB

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 3

...and how frequently are you running them?

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

gmcrouch SSC-Addicted Points: 442 More actions · Answer 4

Log backups run every 30 minutes

Full backups daily

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 5

Wow - you have 50 GB of changes every 30 minutes on a given database and 20 GB on all others. That is a lot of data change in a short period of time and that is a lot of data to write to disk in a short period of time with a log backup. You could increase the frequency but that may not help things overall but will "spread" them out rather. Not sure if that would be a good thing though. I would be really interested to hear what you are doing that is generating that much change.

Side note here but if you are dumping that much log backup to disk every 30 minutes you are going to have to increase your throughput at the disk level in order for that to perform quickly and not cause an impact to your database server.

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

ChrisTaylor SSCarpal Tunnel Points: 4527 More actions · Answer 6

gmcrouch (3/5/2012)
No we are use the standard SQL 2008 Log backups
transation logs are large 1 is 50GB the rested are about 20GB

Are you using SQL 2008 compressed backups? Also, can you confirm whether the actual log data you're backing up is that size or that's the physical size of the log file on disk? What i'm getting at is that the log file (.ldf) can be 50gb on disk but may only contain 10gb of "logged" data within the 30mins. Sorry but trying to put it in layman's terms as you've stated you're not a DBA

They could have been pre-sized at 50GB/20GB or possibly grown at some point to that size, if the latter is the case then 50GB/20GB of data change every 30mins is quite large and based on the info provided thats:

DB528: 50GB

DB529: 20GB

DB530: 20GB

DB531: 20GB

db506: 20GB

DB510: 20GB

DB514: 20GB

DB518: 20GB

which by my calculation is 190GB of data backed up every 30mins...... :w00t:

_________________________________________________________________________________SQLGeordieWeb:- Jarrin ConsultancyBlog:- www.chrisjarrintaylor.co.ukTwitter:- @SQLGeordie

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 7

Along with Chris's note, can you run the following;

dbcc sqlperf (logspace) and provide the output.

Thanks.

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

gmcrouch SSC-Addicted Points: 442 More actions · Answer 8

dbcc sqlperf (logspace)

Database NameLog Size (MB)Log Space Used (%)Status

master19.992192.0271590

tempdb15359.990.075289440

model19.9921912.758890

msdb49.992195.0378970

3305065075.11714.14790

3905308806.182.4855810

Distrib542.24222.0721970

34051018755.685.6197460

3505142587.74235.163210

3605185075.11725.759780

39052819999.990.20161140

39052916268.34.8302080

3905311916.05537.133160

You guys think it hardware almost everythings been changed by other techs including lots of drives

I'll start replacing erverything again Storage (HP MSA 50) Midplane and backplane replaced last night no change.

Next up controller cache and storage IO mod. Drives with Highest responds time are on storage (Data Drives .MDF)

Mother board has been replaced already can replace again.

I defraged drives seemed to help first back up. not much to back up stoped SQL for defrag SQl only up for 5 min before back up

but 2nd was back to Hi response long queue low IO.

I think it looks like a fragmentation or contention issue because of High response LOW IO.

Copy test from log drive to bak drive has High IO 150 MBs spikes over 200MB

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 9

Based on that it doesn't look like your log backups will be that big. Another question, what performance measures are you seeing for avg disk sec / read and avg disk sec / write on the drive that you are writing to? They would be under the physical disk counters in performance monitor.

Thanks.

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

gmcrouch SSC-Addicted Points: 442 More actions · Answer 10

I just finished replacing Controller HP p800 flashed to 7.50 with new cache

also MSA 50 IO mod has been changed

during non back up times

IO is between 50 and 200MB a second response time under 50MS Queue lenght under 5 ( Queue does spike to over 1000 sometimes for short period)

1 minute after back up starts

Response times and queue lenght jump and IO drops to about 1MB

Response over 15000MS Queue Lenght over 1500

get several event 833 errors in app log (IO request taking longer then 15 seconds)

higest reponse time is on reads to .MDF files but is high accross all disk

issue is still there.

here is resouce manager during last log backup (top 10 by response time)

ImagePIDFileRead (B/sec)Write (B/sec)Total (B/sec)I/O PriorityResponse Time (ms)

sqlservr.exe984H:\MSSQL\DATA\360518_DataFile_Data.MDF5850585Normal35,692

sqlservr.exe984H:\MSSQL\DATA\390530_DataFile_Data.MDF7450745Normal32,394

sqlservr.exe984H:\MSSQL\DATA\350514_DataFile_Data.MDF5460546Normal32,132

sqlservr.exe984H:\MSSQL\DATA\Distrib_Data.MDF2,73102,731Normal32,073

sqlservr.exe984I:\MSSQL\DATA\390528_DataFile2_Data.MDF6830683Normal31,629

sqlservr.exe984H:\MSSQL\DATA\340510_DataFile_Data.MDF9109101,820Normal29,363

sqlservr.exe984H:\MSSQL\DATA\390529_DataFile_Data.MDF0576,483576,483Normal29,250

sqlservr.exe984I:\MSSQL\DATA\360518_DataFile2_Data.MDF1580158Normal27,877

sqlservr.exe984I:\MSSQL\DATA\390529_DataFile2_Data.MDF227584,526584,754Normal27,203

sqlservr.exe984I:\MSSQL\DATA\330506_DataFile2_Data.MDF0137137Normal27,197

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 11

gmcrouch (3/7/2012)
get several event 833 errors in app log (IO request taking longer then 15 seconds)

That screams IO subsystem problem. 🙁

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

Toby Harman SSCarpal Tunnel Points: 4168 More actions · Answer 12

Brad McGehee posted a very informative set of slides on the transaction log, which were mentioned on this site yesterday and are accessible here http://bradmcgehee.com/wp-content/uploads/presentations/St%20Louis_Inside%20the%20SQL%20Server%20Transaction%20Log.pdf.

Checking my assumptions at the door:

How many databases has their log file on that drive?

Only 1 log backup job is running at a time.

I notice that in your LOGINFO post a number of the VLFs have the same CreateLSN

2253952104857607482650648180000000457600592
2253952107397127482660648180000000457600592
2253952109936647482670648180000000457600592
2286720112476167482680648180000000457600592

I would suspect that your problem is related to the large number of quite small VLFs you have. My recommendation would be to

1. Recreate your log files as per the recommendations on Kimberly Tripp's article. Fewer, larger VLFs should help

2. Check the fragmentation on the log filesystem

I suspect that you may have a lot of active VLFs. Check for any long running transactions too as these will cause VLFs not to get re-used appropriately.

gmcrouch SSC-Addicted Points: 442 More actions · Answer 13

I am going to try and truncate the transaction logs and resize them

can someone suggest a size to start with for these DB's

I was think 5 or 10MB

need to get number of VLFs down 150+ is way too many

Admin are ready to throw this server in the recycle bin.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 14

150 VLFs is not a huge amount. Still, if you want to get that number down, shrink the log to 0 then regrow to full size in a small number of operations (2 or 4).

Full size, not 5 or 10 MB, and get the autogrow increments set to a sensible value for the log's size as well.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass