Virtualised SQL 2008 Performance Problem

  • Hiya,

    Any help on the following is appreciated. I've got a production system with the following spec.

    It is running slowly and I don't know what to believe about it any more...

    The Server is virtualised (vmware)

    The SQL VM is SQL 2008 R2 Enterprise on Windows 2003 R2 x64 Enterprise

    It has 16GB of RAM. I have allocated 3GB Minimum, 14GB Maximum to SQL/

    From Perfmon I have taken the following -

    Working Set 70766593

    SQL Server Memory Manager - Total Server Memory (KB) 3072000 - 3GB

    SQL Server Buffer Manager - Total Pages 384000

    SQL Server Buffer Manager - Buffer Cache Hit Ratio 99.976

    Memory - Pages/sec typical 382 Max 2401

    I have 'Locked pages in memory' set (running under a domain account)

    As it is on 64 bit I have not used any boot.ini switches as I understood them to be 32bit only.

    I have the following in the log file on startup

    - Using dynamic lock allocation. Initial allocation of 2500 Lock blocks and 5000 Lock Owner blocks per node. This is an informational message only. No user action is required.

    - Using locked pages for buffer pool.

    - Large Page Allocated: 32MB

    - Large Page Granularity: 2097152

    - Large Page Extensions enabled.

    From 2am I also have an I/O issue in the log that I need to approach the VMWare guy with -

    SQL Server has encountered 4312 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [S:\SQLData\tempdb.mdf] in database [tempdb] (2). The OS file handle is 0x000000000000062C. The offset of the latest long I/O is: 0x00000094780000

    Further to this I have used SQLIOSim the results from which are pasted at the foot of this post.

    The L: drive (logs) is perfroming much better than the S: drive (data)

    So to summarise, I'm struggling with -

    - Slowness

    - Task Manager is reporting sqlservr.exe sing only 69,764 K (a red herring?)

    - The scary Pages/sec figure

    - What else should I look at here? (or what am I missing).

    I dont have access to the VMWare environment, but any pointers on SQL config / VMWare what else people would look for would be appreciated.

    (The drives are formatted and aligned correctly in the VM).

    Thank you

    S

    SQLIOSIM O/P >

    11/28/12 09:55:31 340 CreateFileStream Expanding file L:\sqliosim.ldx from 0MB to 50MB CLogicalFile::ExpandCollapse e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 1097

    11/28/12 09:55:31 6412 CreateFileStream Expanding file S:\sqliosim.mdx attempt COMPLETE CLogicalFile::ExpandCollapse e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 1207

    11/28/12 09:55:31 340 CreateFileStream Expanding file L:\sqliosim.ldx attempt COMPLETE CLogicalFile::ExpandCollapse e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 1207

    11/28/12 09:55:31 6920 System Starting test cycle CSQLIOSimView::OnWndMsg e:\sqltdbms\storeng\util\sqliosim\sqliosimview.cpp 238

    11/28/12 09:55:31 5968 System Created buffer pool. Size=1039 MB, buffers 132992, locked pages disabled. CBufferPool::hrCreateBufferPool e:\sqltdbms\storeng\util\sqliosim\buffer.cpp 1205

    11/28/12 09:55:34 6968 Overall Test Progress Performing initial Update Scan CTestCycle::EntryPoint e:\sqltdbms\storeng\util\sqliosim\testusers.cpp 1855

    11/28/12 09:55:34 6968 Overall Test Progress Checkpointing initial Update Scan CTestCycle::EntryPoint e:\sqltdbms\storeng\util\sqliosim\testusers.cpp 1861

    11/28/12 09:55:35 6968 Overall Test Progress Performing Full Test iteration #1 CTestCycle::EntryPoint e:\sqltdbms\storeng\util\sqliosim\testusers.cpp 1873

    11/28/12 10:00:36 6968 Overall Test Progress Performing final Checkpoint #1 CTestCycle::EntryPoint e:\sqltdbms\storeng\util\sqliosim\testusers.cpp 1886

    11/28/12 10:00:36 6968 Overall Test Progress Performing final Audit Scan #1 CTestCycle::EntryPoint e:\sqltdbms\storeng\util\sqliosim\testusers.cpp 1896

    11/28/12 10:00:37 6920 System Test cycle complete CSQLIOSimView::OnWndMsg e:\sqltdbms\storeng\util\sqliosim\sqliosimview.cpp 246

    11/28/12 10:00:37 5968 System Stopping simulation Simulator_Main e:\sqltdbms\storeng\util\sqliosim\simulator_main.cpp 407

    11/28/12 10:00:37 1220 Display Monitor Closing file S:\sqliosim.mdx CLogicalFile::~CLogicalFile e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 245

    11/28/12 10:00:37 1220 Display Monitor ********** Final Summary for file S:\sqliosim.mdx ********** CLogicalFile::OutputSummary e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 517

    11/28/12 10:00:37 1220 Display Monitor File Attributes: Compression = No, Encryption = No, Sparse = No CLogicalFile::OutputSummary e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 521

    11/28/12 10:00:37 1220 Display Monitor Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 86, Number of times IO throttled = 1695, IO request blocks = 61 CLogicalFile::OutputSummary e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 529

    11/28/12 10:00:37 1220 Display Monitor Reads = 12680, Scatter Reads = 7808, Writes = 58, Gather Writes = 6857, Total IO Time (ms) = 1102556 CLogicalFile::OutputSummary e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 540

    11/28/12 10:00:37 1220 Display Monitor DRIVE LEVEL: Sector size = 512, Cylinders = 26108, Media type = 12, Sectors per track = 63, Tracks per Cylinders = 255 CLogicalFile::OutputSummary e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 562

    11/28/12 10:00:37 1220 Display Monitor DRIVE LEVEL: Read cache enabled = No, Write cache enabled = No CLogicalFile::OutputSummary e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 576

    11/28/12 10:00:37 1220 Display Monitor DRIVE LEVEL: Read count = 20607, Read time = 332704, Write count = 7253, Write time = 805235, Idle time = 175806, Bytes read = 3228794880, Bytes written = 1655392256, Split IO Count = 0, Storage number = 6, Storage manager name = FTDISK CLogicalFile::OutputSummary e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 587

    11/28/12 10:00:37 1220 Display Monitor Closing file L:\sqliosim.ldx CLogicalFile::~CLogicalFile e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 245

    11/28/12 10:00:37 1220 Display Monitor ********** Final Summary for file L:\sqliosim.ldx ********** CLogicalFile::OutputSummary e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 517

    11/28/12 10:00:37 1220 Display Monitor File Attributes: Compression = No, Encryption = No, Sparse = No CLogicalFile::OutputSummary e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 521

    11/28/12 10:00:37 1220 Display Monitor Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 41, Number of times IO throttled = 7, IO request blocks = 8 CLogicalFile::OutputSummary e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 529

    11/28/12 10:00:37 1220 Display Monitor Reads = 0, Scatter Reads = 0, Writes = 141682, Gather Writes = 0, Total IO Time (ms) = 1196037 CLogicalFile::OutputSummary e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 540

    11/28/12 10:00:37 1220 Display Monitor DRIVE LEVEL: Sector size = 512, Cylinders = 26108, Media type = 12, Sectors per track = 63, Tracks per Cylinders = 255 CLogicalFile::OutputSummary e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 562

    11/28/12 10:00:37 1220 Display Monitor DRIVE LEVEL: Read cache enabled = No, Write cache enabled = No CLogicalFile::OutputSummary e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 576

    11/28/12 10:00:37 1220 Display Monitor DRIVE LEVEL: Read count = 15, Read time = 271, Write count = 143313, Write time = 1637869, Idle time = 28638, Bytes read = 5061632, Bytes written = 8729159168, Split IO Count = 0, Storage number = 5, Storage manager name = FTDISK CLogicalFile::OutputSummary e:\sqltdbms\storeng\util\sqliosim\fileio.cpp 587

    11/28/12 10:00:37 1220 Display Monitor Cleaning up buffer pool CBufferPool::DestroyBufferPool e:\sqltdbms\storeng\util\sqliosim\buffer.cpp 1226

    11/28/12 10:00:37 1220 Display Monitor Buffer Pool: validated buffers 203728, pages 0, discarded buffers 0 CBufferPool::DestroyBufferPool e:\sqltdbms\storeng\util\sqliosim\buffer.cpp 1230

    11/28/12 10:00:37 5968 System Simulator Stress Test Attempt Complete

  • It sounds like a disk issue. How is your storage subsystem setup, and where are data, log and tempdb files stored?

    What do you see from the sys.dm_os_waiting_tasks DMV?

    Joie Andrew
    "Since 1982"

  • data and tempdb are on S:

    log are on L:

    I ran -

    select wait_duration_ms , wait_type, blocking_session_id from sys.dm_os_waiting_tasks order by 1 desc

    2512536962ONDEMAND_TASK_QUEUENULL

    2512536575BROKER_TRANSMITTERNULL

    2512536575BROKER_TRANSMITTERNULL

    2512530889KSOURCE_WAKEUPNULL

    2512530880BROKER_EVENTHANDLERNULL

    26978702XE_DISPATCHER_WAITNULL

    906459CHECKPOINT_QUEUENULL

    34778DISPATCHER_QUEUE_SEMAPHORENULL

    8612XE_TIMER_EVENTNULL

    4828REQUEST_FOR_DEADLOCK_SEARCHNULL

    3910SQLTRACE_INCREMENTAL_FLUSH_SLEEPNULL

    996BROKER_TO_FLUSHNULL

    462LAZYWRITER_SLEEPNULL

    211LOGMGR_QUEUENULL

    thanks

    s

  • If you're looking at disk problems, for once, you can leave the VM guy alone and concentrate on the physical box. You may need to wake up the SAN guy though. It sounds like you either have a hardware issue on the disks, or, possibly, some tuning opportunities in the queries to reduce the amount of I/O that they're doing. Both are also a possibility.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thinking about it some more, take a look at sys.dm_os_wait_stats. That's going to tell you, in aggregate, what is causing your system to run slow. That will indicate where the issues lie. If it suggests memory or cpu as a root cause, you will need to go back to your VM administrator. If it's disks, it's hardware or queries or both.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thank you Grant .

    (I saw you present at SQL in the City - London actually!)

    Using Glenn Berry's query i can confirm it's all pointing back at I/O >

    wait_type wait_time_spctrunning_pct

    ASYNC_IO_COMPLETION4098.0420.2020.20

    BACKUPBUFFER 3555.0717.5237.72

    BACKUPIO 2620.4912.9250.63

    OLEDB 2458.1112.1162.75

    PREEMPTIVE_OS_AUTHENTICATIONOPS1534.907.5670.31

    WRITELOG 1344.436.6376.94

    LCK_M_U 1231.486.0783.01

    BACKUPTHREAD 1188.015.8688.86

    PAGEIOLATCH_SH 733.113.6192.48

    IO_COMPLETION 407.072.0194.48

    PREEMPTIVE_OS_CRYPTACQUIRECONTEXT239.831.1895.67

    PAGEIOLATCH_EX 104.760.5296.18

    SOS_SCHEDULER_YIELD94.740.4796.65

    LCK_M_S 81.420.4097.05

    SLEEP_BPOOL_FLUSH65.740.3297.37

    PREEMPTIVE_OS_AUTHORIZATIONOPS63.580.3197.69

    PREEMPTIVE_OS_DELETESECURITYCONTEXT52.420.2697.95

    PREEMPTIVE_OS_WRITEFILE49.480.2498.19

    MSQL_XP49.41 0.2498.43

    PREEMPTIVE_OS_GETPROCADDRESS49.150.2498.68

    WRITE_COMPLETION48.760.2498.92

    PREEMPTIVE_OS_CRYPTIMPORTKEY37.650.1999.10

    Why does Task Manager report sqlservr.exe as using only 69,764K?

    Is this a red herring?

    Thanks

    S

  • It's possible. Once you're in a VM, you can't trust those numbers. However, I'd check to see that you have Max memory set appropriately so that it is using what the VM is giving it.

    Other than that, you're dealing with I/O issues. Time to kick the SAN guy good & hard.

    And thanks for coming out to SQL in the City. London was a total blast. I love presenting over there.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • sqlbanana (11/29/2012)


    data and tempdb are on S:

    log are on L:

    I ran -

    select wait_duration_ms , wait_type, blocking_session_id from sys.dm_os_waiting_tasks order by 1 desc

    2512536962ONDEMAND_TASK_QUEUENULL

    2512536575BROKER_TRANSMITTERNULL

    2512536575BROKER_TRANSMITTERNULL

    2512530889KSOURCE_WAKEUPNULL

    2512530880BROKER_EVENTHANDLERNULL

    26978702XE_DISPATCHER_WAITNULL

    906459CHECKPOINT_QUEUENULL

    34778DISPATCHER_QUEUE_SEMAPHORENULL

    8612XE_TIMER_EVENTNULL

    4828REQUEST_FOR_DEADLOCK_SEARCHNULL

    3910SQLTRACE_INCREMENTAL_FLUSH_SLEEPNULL

    996BROKER_TO_FLUSHNULL

    462LAZYWRITER_SLEEPNULL

    211LOGMGR_QUEUENULL

    thanks

    s

    Are these disks vmware virtual disks or RDM's.

    If VMDKs, are the virtual disks all on the same datastore?

    Random IO sucks through a virtual disk at the best of times, throw into the mix an overcommitted data store and your performace will be zero.

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply