Poor SQL Performance

Question

Poor SQL Performance

j21283

SSC-Addicted

Points: 454
More actions
May 25, 2016 at 10:24 am

#327263

I'm hoping some folks might be able to help out with a poor performance situation for our SQL server.
We recently did a major system upgrade from SQL 2008 and Windows 2008 and Xenserver to:
SQL 2012 R2 (build 11.0.6020) 21GB max memory, 24GB total
Windows Server 2012 R2
Xenserver 6.5 (build 6.5.2.2477)
After upgrading we are getting user complaints and noticeable slowness\time outs at our more peak times of use. Looking at DMV's most of our wait times center around the following:
BACKUPIO
WRITELOG
ASYNC IO COMPLETION
Looking at the virtual file stats DMV I am noticing read and write latency across multiple files:
Master DB Log File - Read Latency 191ms, Write Latency 8ms
Temp DB Data File - Read Latency 21 ms, Write Latency 98ms
Production DB Data File - Read Latency 121ms, Write Latency 38ms
Production DB Log File - Ready Latency 331ms, Write Latency 12ms
The data and log files are on separate RAID 10 disk groups on the SAN - however the tempdb currently resides in the same disk group as the data drive does - however we don't have another disk group to allocate tempdb to - only other option would be the disk group where all the server OS's are located.
I'm looking for some advice to narrow down where the problem might be at - whether SQL or SAN/Citrix Performance Issues.
Thanks,

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 1

j21283 (5/25/2016)
I'm hoping some folks might be able to help out with a poor performance situation for our SQL server.
We recently did a major system upgrade from SQL 2008 and Windows 2008 and Xenserver to:
SQL 2012 R2 (build 11.0.6020) 21GB max memory, 24GB total
Windows Server 2012 R2
Xenserver 6.5 (build 6.5.2.2477)
After upgrading we are getting user complaints and noticeable slowness\time outs at our more peak times of use. Looking at DMV's most of our wait times center around the following:
BACKUPIO
WRITELOG
ASYNC IO COMPLETION
Looking at the virtual file stats DMV I am noticing read and write latency across multiple files:
Master DB Log File - Read Latency 191ms, Write Latency 8ms
Temp DB Data File - Read Latency 21 ms, Write Latency 98ms
Production DB Data File - Read Latency 121ms, Write Latency 38ms
Production DB Log File - Ready Latency 331ms, Write Latency 12ms
The data and log files are on separate RAID 10 disk groups on the SAN - however the tempdb currently resides in the same disk group as the data drive does - however we don't have another disk group to allocate tempdb to - only other option would be the disk group where all the server OS's are located.
I'm looking for some advice to narrow down where the problem might be at - whether SQL or SAN/Citrix Performance Issues.
Thanks,

I can definitively declare that your IO performance SUCKS. I can't, however, tell you WHY. 😎

With a SAN and VM in play it could be a BUNCH of things. Do you know EXACTLY what underlies that RAID10 set you are using? If not find out. What else resides on the same spindles. Did you test the disk speed (or did the SAN people)?

Why just 24GB RAM?? You seem have more than 24GB database, and SQL 2012 Standard Edition can use 64GB. Maxing out RAM is the single easiest and very inexpensive way to help IO performance.

What types of total numbers are you seeing on file IO stalls in a differential analysis?

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

j21283 SSC-Addicted Points: 454 More actions · Answer 2

Hi Kevin,

Thanks for your response - this was where I was leaning as well - I will say before the upgrade of these various components our performance seemed to be better - although I don't have actual numbers the responsive of our application seemed much better before the switchover.

We are actually currently investigating purchasing new hardware as we are about at the end of our life cycle for our SAN and Virtual Server Physical Hosts.

Our SAN is the following: PowerVault MD3200 iSCSI SAN Storage - dual controller - each controller has 4 1Gb Ethernet ports.

There is actually nothing else on those same spindles - these disks are only used for SQL and nothing else - which is why the performance is so surprising.

Our database is about 40GB in size - so I could see if we have additional memory on the virtual host we could allocate to help alleviate the pressure for now. We don't have a dedicated person for our SAN or someone who knows much about it - it was contracted with some folks about 5 years ago - so I'd have to contact a consultant to have them take a look - so just trying to gather data to make the case for spending some money.

How would I go about the IO stall differential analysis?

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 3

j21283 (5/25/2016)
Hi Kevin,
Thanks for your response - this was where I was leaning as well - I will say before the upgrade of these various components our performance seemed to be better - although I don't have actual numbers the responsive of our application seemed much better before the switchover.
We are actually currently investigating purchasing new hardware as we are about at the end of our life cycle for our SAN and Virtual Server Physical Hosts.
Our SAN is the following: PowerVault MD3200 iSCSI SAN Storage - dual controller - each controller has 4 1Gb Ethernet ports.
There is actually nothing else on those same spindles - these disks are only used for SQL and nothing else - which is why the performance is so surprising.
Our database is about 40GB in size - so I could see if we have additional memory on the virtual host we could allocate to help alleviate the pressure for now. We don't have a dedicated person for our SAN or someone who knows much about it - it was contracted with some folks about 5 years ago - so I'd have to contact a consultant to have them take a look - so just trying to gather data to make the case for spending some money.
How would I go about the IO stall differential analysis?

This company sounds like my typical client. 😀

Take a snapshot of file IO stall dmv, wait a while (I usually do 3 or 5 mins during a period of interesting activity), take another snapshot and diff them and divide by time. Voila - average ms per IO.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

j21283 SSC-Addicted Points: 454 More actions · Answer 4

Ah ok thought I was perhaps missing something for the analysis - I've ran a few scripts related to this and this is what we get:

TempDB avg io stall ms: 65.5 ms

Prod avg io stall ms: 94.3ms

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 5

j21283 (5/25/2016)
Ah ok thought I was perhaps missing something for the analysis - I've ran a few scripts related to this and this is what we get:
TempDB avg io stall ms: 65.5 ms
Prod avg io stall ms: 94.3ms

Not really helpful there, right? If those numbers are for 20 reads/writes who cares. If they are for 20000 in 180 seconds I care a LOT because the total slowdown (20000*94ms) is a telephone number. Also you must always split out reads and writes.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

j21283 SSC-Addicted Points: 454 More actions · Answer 6

Ah ok that makes sense to track seperate - after benchmarking for a bit today if I'm calculating that right it doesn't seem like it is that bad right now? I've attached a spreadsheet with my calculations.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 7

Again not enough information. What was the monitoring interval? If it was 30 seconds you are doing very poorly. If it was 180 seconds then you are waiting one second of clock time for IO per second of app time. That's not good in my book, especially since writes are high. If you monitored for 10 minutes then not too bad on the whole but things could still seem slowish if the actual write stalls were clustered together.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

j21283 SSC-Addicted Points: 454 More actions · Answer 8

Hi Kevin,

The elapsed_time column is the interval between the baseline I took and the next polling. So the first few lines in the excel sheet would be about 6 minutes of polling time it looks like.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 9

So about 400ms IO waits per 1 second clock time. I call that not good. If your perf is acceptable then maybe you can work on other things.

Have you tuned your queries and your system(s)? IO slowness is often a result of poor structures, poor indexing, poor code, poor maintenance, poorly configured hardware/VM, etc, etc.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

AlexSQLForums SSChampion Points: 14277 More actions · Answer 10

j21283 (5/25/2016)
I'm hoping some folks might be able to help out with a poor performance situation for our SQL server.
We recently did a major system upgrade from SQL 2008 and Windows 2008 and Xenserver to:
SQL 2012 R2 (build 11.0.6020) 21GB max memory, 24GB total
Windows Server 2012 R2
Xenserver 6.5 (build 6.5.2.2477)
After upgrading we are getting user complaints and noticeable slowness\time outs at our more peak times of use. Looking at DMV's most of our wait times center around the following:
BACKUPIO
WRITELOG
ASYNC IO COMPLETION
Looking at the virtual file stats DMV I am noticing read and write latency across multiple files:
Master DB Log File - Read Latency 191ms, Write Latency 8ms
Temp DB Data File - Read Latency 21 ms, Write Latency 98ms
Production DB Data File - Read Latency 121ms, Write Latency 38ms
Production DB Log File - Ready Latency 331ms, Write Latency 12ms
The data and log files are on separate RAID 10 disk groups on the SAN - however the tempdb currently resides in the same disk group as the data drive does - however we don't have another disk group to allocate tempdb to - only other option would be the disk group where all the server OS's are located.
I'm looking for some advice to narrow down where the problem might be at - whether SQL or SAN/Citrix Performance Issues.
Thanks,

What kind of upgrade was this?

Backup/Restore?

Upgrade Wizard?

Alex S

j21283 SSC-Addicted Points: 454 More actions · Answer 11

Ah ok I see how you got to that calculation now - I think with this information I'm going to see if I can get a resource to look at the Xen configuration and SAN items - I have a feeling this issue lies in the way our updated Xen environment is configured as the latency seemed better on the older SQL/OS/Xen platform - we do index/stat maintenance and also look for high read/write queries to try and tamper them down.

Thanks for your help!

j21283 SSC-Addicted Points: 454 More actions · Answer 12

Hi Alex - we provisioned an entirely new server and installed SQL 2012 fresh and then backed up and restored the databases to the new server.