Tempdb a total mess

  • Hi,

    I'm analyzing a server 2008 R2, build RTM...

    I was looking at the sys.dm_io_virtual_file_stats and got scared with the tempdb write stalls... io_stall_write_ms/num_of_writes gives 1080ms.. 1 second average wait to write to tempdb... The read is 10ms on tempdb... The other databases have 9ms read and 10ms to 30ms write stalls...

    The tempdb has 4 files, 2GB each, on C drive along with the other databases. It's a RAID5 (I know, bad for writes, not the best for SQL but they already had this EqualLogic storage where all the VMs, yep SQL is runing on a VM...., are stored)...

    There are lots of SQL Server jobs and IIS services (on another machine) that interact with the ERP, that uses a lot of tempdb... but 1sec write wait stall isn't normal...

    I also noticed that the ERP database, with 200GB, has RCSI activated... can this be the source of the problem?

    I activated the AdHoc workload on SQL, Maxdop 4 (it has a Six-Core AMD Opteron(tm) Processor 2435, but only 4 are dedicated to the SQL VM) and threshold for parallelism is set to 30....

    What can I do the analyze deeper what's the problem with tempdb? Should I remove 2 files and only have 2 on tempdb?

    I already told them to update to SP3, since RTM is outdated...

    Thanks,

    Pedro



    If you need to work better, try working less...

  • Contact your VM admin and ask to check DISK I/O for the drive on which you do have issue. Prefer to perform DISK I/O check on all the drives.

    ---------------------------------------------------
    "Thare are only 10 types of people in the world:
    Those who understand binary, and those who don't."

  • Only tempdb has that problem and all the databases are on the same SAN...

    The SAN has one big RAID5 partition that's splitted into several "disks". The VM is in one of those disks with all the databases, but only tempdb has the problem.



    If you need to work better, try working less...

  • Yes, I do understand but unless and until I/O test perform and result is out it is difficult to predict.

    ---------------------------------------------------
    "Thare are only 10 types of people in the world:
    Those who understand binary, and those who don't."

  • PiMané (1/20/2015)


    Hi,

    I'm analyzing a server 2008 R2, build RTM...

    I was looking at the sys.dm_io_virtual_file_stats and got scared with the tempdb write stalls... io_stall_write_ms/num_of_writes gives 1080ms.. 1 second average wait to write to tempdb... The read is 10ms on tempdb... The other databases have 9ms read and 10ms to 30ms write stalls...

    The tempdb has 4 files, 2GB each, on C drive along with the other databases. It's a RAID5 (I know, bad for writes, not the best for SQL but they already had this EqualLogic storage where all the VMs, yep SQL is runing on a VM...., are stored)...

    There are lots of SQL Server jobs and IIS services (on another machine) that interact with the ERP, that uses a lot of tempdb... but 1sec write wait stall isn't normal...

    I also noticed that the ERP database, with 200GB, has RCSI activated... can this be the source of the problem?

    I activated the AdHoc workload on SQL, Maxdop 4 (it has a Six-Core AMD Opteron(tm) Processor 2435, but only 4 are dedicated to the SQL VM) and threshold for parallelism is set to 30....

    What can I do the analyze deeper what's the problem with tempdb? Should I remove 2 files and only have 2 on tempdb?

    I already told them to update to SP3, since RTM is outdated...

    Thanks,

    Pedro

    Pedro,

    I would suggest a couple of things. First, you can dig into the actual transactions to see which ones are slow. This can tell you, for example, if the ERP database is performing badly compared to others. You could do this investigation with Profiler. You could also query the EXEC_Sessions and Exec_Requests DMVs, and possibly others to get similar information. You could go to BrentOzar.com and get sp_AskBrent to install and run that to capture data. You could go to SQLSkills.com and Paul Randal has some blogs on gathering IO info from sys.dm_io_virtual_file_stats and analyzing it. So, there are a variety of ways to capture the statements running and corresponding durations to narrow down where the problem is.

    I would also say that suspecting RCSI as a contributing factor is probably good. TempDB will have to maintain row versions for modified data, increasing IO and the size of TempDB.

    https://technet.microsoft.com/en-us/library/ms188277(v=sql.105).aspx

    See especially the section on "Costs of Row Versioning-Based Isolation Levels."

  • I've done that and nothing out of the ordinary...

    I'll turn off RCSI to see how it goes...

    All the databases are on the same disk so IO performance is not the real issue here.. Tempdb has loads of reads/writes probably due to RCSI and row versioning... It'd be better an SSD for tempdb in these cases.

    I'm monitoring the queries with server traces (profiler is very "heavy" on the server) and will check what going on the server... The truth is that before RCSI there were a lot of dead locks... but all the databases had the same stall times. now, with RCSI, there are almost no deadlocks but tempdb is struggling..

    Thanks

    Pedro



    If you need to work better, try working less...

  • Kicking in open doors, but why is this SQL2008R2 instance still on the RTM build ? ( April 21, 2010 )

    Current is SP3 !

    How to obtain the latest service pack for SQL Server 2008 R2

    Having read the last line in your OP, let me support you stating Service packs do contain more than "just" bug fixes !

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • I've been taking "snapshots" from tempdb virtual file stats and found that when users are using the ERP tempdb's times drop to 20ms / write.

    When some sql jobs run those times go up to 1100ms. Some jobs include index rebuild (on tempdb), large data's volume processing and other "heavy" stuff.

    I'll look deeper into those jobs to see what really goes on and hopefully "normal" EPR operations won't increase the times.

    Thanks,

    Pedro



    If you need to work better, try working less...

  • I'd get those tempdb files off the C: drive or off RAID5 asap.

    RAID10 if you can.

    Good article here from Denny Cherry.

    qh

    [font="Tahoma"]Who looks outside, dreams; who looks inside, awakes. – Carl Jung.[/font]

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply