Storage Subsystems and RAMDisks

  • We're setting up a new database server for our OLTP and smaller website databases, and disk performance is a big area of concern. We have 3 disk arrays available to us, and 64GB of RAM.

    The 3 storage systems are:

    Inline SAS HDD (2-disks 15K)

    DAS SAS HDD (4-disks 15K)

    DAS SAS SSD (8-disks)

    On these 3 storage arrays, we have 6 main types of storage.

    System Data Files (master, msdb, model, distribution)

    TempDb Data Files (8 data files for 8 cores)

    OLTP Data Files

    OLTP Data Indexes (separate file)

    Website Data Files (much lower use than OLTP)

    All The Log Files (System, OLTP, Web, etc)

    We planned on distributing our files on the storage systems like this:

    Inline HDD (OS) – Don’t want much here because it houses each of the OS’s

    - System Data Files

    - TempDb Data File (1 of 8 data files)

    MD HDD (4-disk)

    - All The Log Files

    - TempDb Data Files (3 of 8 data files)

    MD SDD (8-disk)

    - OLTP Data Files

    - OLTP Data Indexes

    - Website Data Files

    - TempDb Data Files (4 of 8 data files)

    Analyzing our current system, the largest I/O requirements are for TempDb, followed by OLTP Data files - so those are our main focus. We are looking into the option of allocating 8GB more RAM to the server to create a RAMDisk (not taking away any of the 64GB already allocated to SQL). The RAMDisk would be used entirely for 4 of the 8 TempDb data files, isolating most of the TempDb work in RAM. Then rearrange the rest a bit to better isolate TempDb and OLTP data files:

    Inline HDD (OS) – Don’t want much here because it houses each of the OS’s

    - System Data Files

    RAMDisk (8GB)

    - TempDb Data Files (4 of 8 data files - 2GB each)

    MD HDD (4-disk)

    - All The Log Files

    - Website Data Files

    MD SSD (8-disk)

    - OLTP Data Files

    - OLTP Data Indexes

    - TempDb Data Files (4 of 8 data files - 2GB each)

    To me, this solution looks much better, and from what I read we should get some major performance boosts as a result. There are lots of guides out there showing how to use RAMDisks for TempDb performance improvements, but other experts say to stick to the SQL Recommended Best Practices which doesn't include RAMDisks.

    My question is, has anyone here used RAMDisks for TempDb, and what are their thoughts and experience?

  • I wouldn't bother using a RAM disk for tempdb.

    Most of what accesses tempdb will already be perfectly happy to live in RAM till it runs out and has to use the disk. So the gains are likely to be minor, if they exist at all. Table variables, temp tables, and worktables, only spill out of RAM onto drives, if they have to because of lack of available RAM, per MS.

    So, test it, and if you find it really gives you the boost you need, go for it. But definitely test thoroughly and don't be surprised if the benefit is hard to isolate or nonexistent.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (12/20/2012)


    Table variables, temp tables, and worktables, only spill out of RAM onto drives, if they have to because of lack of available RAM, per MS.

    I'd heard temporary tables, unlike variables, were always stored in TempDb's data files and not in RAM - do you have any links or articles that explain this that I could read? From what I can find it looks like we're both wrong, it looks like both table variables and temporary tables are always stored in TempDb and not in memory.

    http://cybersql.blogspot.ca/2009/12/yes-table-variables-and-temp-tables.html

    We've noticed in our live environment that the most active database for disk i/o in our system is TempDb, it actually noses out our OLTP. This makes sense if the article I linked to is true, and temporary tables and table variables are always written to the TempDb data files. We didn't think this was because of a lack of RAM (currently 32GB for our 35GB OLTP database), but I guess we'll know when we move to the 64GB system and monitor TempDb usage there.

  • GSquared (12/20/2012)


    I wouldn't bother using a RAM disk for tempdb.

    Most of what accesses tempdb will already be perfectly happy to live in RAM till it runs out and has to use the disk. So the gains are likely to be minor, if they exist at all. Table variables, temp tables, and worktables, only spill out of RAM onto drives, if they have to because of lack of available RAM, per MS.

    I do agree that in most cases (this one in particular), putting tempdb on a RAMDisk is a waste of RAM. In a few cases (i.e. tempdb maximum size is smaller than the amount of RAM doing nothing), then it could help tremendously, since some worktables spill to RAM based on SQL incorrectly estimating data size, and in particular the tempdb log file does get a fair number of writes in many of our cases.

    I am, however, extremely puzzled by your tempdb data files mixed between different storage speeds. That should lead to either very inconsistent performance, or consistent performance at the very slowest speed, neither of which is beneficial. Why split tempdb between different types of storage? What tradeoffs are you expecting?

  • My plan with splitting tempdb across three different storage systems was because I didn't have any fast storage to isolate the files on their own. My understanding is that SQL would use the tempdb files in round robin, and while I assume you're correct that some tempdb files would perform faster than others, several concurrent processes would be able to access tempdb data using the three storage systems simultaneously. Since tempdb would be in contention with other I/O on each storage system.

    I think with the RAMDisk solution, it would probably be best to put all of the tempdb files on the RAMDisk together - instead of split with the SSD like I originally posted. What are your thoughts, am I misunderstanding something?

  • One good article on temp and variable: http://www.simple-talk.com/sql/t-sql-programming/temporary-tables-in-sql-server/

    Answer from Microsoft on where (RAM vs disk) temp tables and table variables are created: http://support.microsoft.com/kb/305977

    Relevant quote (from Q4):

    If memory is available, both table variables and temporary tables are created and processed while in memory (data cache).

    Hence my point that leaving the RAM available for the cache is probably a better use than a RAM-disk.

    Puting the log files for tempdb on a RAM disk might have some performance improvements, but it is again unlikely because of how caching works. It would definitely render the data non-ACID, but tempdb pretty much already does that anyway, so no loss there.

    If you decide to go ahead and do that, I'd like to see before and after performance numbers for the server. Total wait time would be the one to look at.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I'm going to try with and without a RAMDisk for TempDb, as you suggested, compare the two to be sure.

    But I'm more confused now than before about disk usage with temporary tables and table variables. Your MS link is for SQL2000, which I have read handled temp tables in RAM differently, so I'm not sure that I believe the answer to Q4 is still true. The other article you linked to suggests that SQL Server caching temporary tables in RAM (when available) has led to the 'legend' that they're stored in RAM instead of disk. What I can gather from various sources is that it seems like both temp tables and variables are always written to tempdb's data files, but are often cached in RAM as well when available, especially when smaller and more frequently accessed (which would still mean that fast storage for TempDb would be required, even when RAM is available). Which doesn't sound all that different from user tables to me.

    So this thread has taken a turn, but does anyone have anything recent and definitive that explains exactly where temp tables and table variables are stored (RAM or Disk), and in what scenarios does that depend? If it is stored in RAM when RAM is available, why would Dave Pinal's article I linked to earlier show TempDb pages written to disk?

  • SQL 2000 did handle table variables differently than prior versions of SQL Server. They didn't have them, it was a new feature. It hasn't changed, mechanically, since then.

    Here's some data on how table variables work: http://sqlinthewild.co.za/index.php/2010/10/12/a-trio-of-table-variables/

    Here's some more data on both: http://sqlinthewild.co.za/index.php/2007/12/19/temp-tables-and-table-variables/

    Bing/Google the subject, you'll find the data you need pretty readily.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (12/20/2012)


    Most of what accesses tempdb will already be perfectly happy to live in RAM till it runs out and has to use the disk. So the gains are likely to be minor, if they exist at all. Table variables, temp tables, and worktables, only spill out of RAM onto drives, if they have to because of lack of available RAM, per MS.

    I've look at Google and your links, and it still appears to me that what you said before isn't entirely correct - which is why I was hoping for some explicit clarification. The articles you just linked to are a perfect example of what I mean. They seem to contradict each other, as well as what you said about table variables and temp tables (quoted above). So far I'm still assuming that they do always write to disk - both temp tables and table variables because that's the bulk of what I've read as well as what I've experienced when I monitor my TempDB's MDF read/write usage. If RAM is available, they cache as per usual, but they will still always write to disk even if sufficient RAM is available.

    From the first link, "A Trio of Table Variables" http://sqlinthewild.co.za/index.php/2010/10/12/a-trio-of-table-variables/%5B/b%5D%5Bhr%5D

    This myth can be broken down into two parts:

    1. ...

    2. That table variables are not written to disk

    To answer myth #2, it looks like he creates a basic table variable example that is definitely not under memory pressure...

    20 Megabytes. The SQL instance I’m running this on is allowed to use up to 2 GB of memory. No way on earth is this table variable going to cause any form of memory pressure (and I promise there is nothing else running)

    He then kills SQL, and uses a hex editor to view the tempdb.mdf data file on the disk. He shows the data in the disk to prove that his table variable was written to disk even when there was plenty of RAM available. Then says this about myth #2...

    That pretty much speaks for itself. This myth, clearly false.

    What he seems to be showing here directly contradicts what is said in the second article that you linked to from the same site:

    From the second link, "Temp Table and Table Variables" http://sqlinthewild.co.za/index.php/2007/12/19/temp-tables-and-table-variables/%5B/b%5D%5Bhr%5D

    ... Are not persisted to disk unless there is memory pressure, or the table is too large to fit in the data cache

    So the first article shows that table variables are written to disk even when there is no memory pressure, while the second article claims in a summary of table variables that they are not persisted to disk unless there is memory pressure. I really must be misunderstanding something fundamental, like you said there's lots of information on Google/Bing, but most of what I find seems to be obviously contradictory.

    The only obvious difference in context between them that I see is that the second link claiming table variables "are not persisted to disk unless there is memory pressure" is over 5 years old, where the newer article shows that they are persisted to disk even when RAM is available. Does the newer article debunk the myth that they perpetuated in the older article?

    I thank you for posting those two links, these are perfect examples to work with, can you explain them to me and how they are both correct?

  • Honestly, if you want commentary on the two links, you might ask Gail directly. She (not he) is better positioned to explain her posts than I am.

    I'm not a Microsoft engineer, so I don't have "insider access" to the code that handles disk-spillover. (Neither is Gail, for that matter.) So what I have (and what she has) is data from what we can research online, plus our own experimentation.

    That's why I said I'd be interested in seeing your numbers comparing what you get from using a RAM disk vs the same load on a regular (HDD or SSD) mechanism. After all, direct observation is generally superior to reading about something anyway.

    I'm giving you the data I've got. That's all I can do in this kind of situation.

    Personally, in 12 years as a DBA, I've never yet found that optimization at that level has mattered to me. I've found that 99% or more of performance issues on the database servers I've dealt with, are in the way the data was architected (normalization, et al), and in the way the DAL is built (procs, inline code, et al).

    But that's my limited, annecdotal experience. I'd be willing to bet the guys who do EVE Online (last I heard, it runs on SQL Server) are MUCH more concerned about getting every microsecond of performance out of their databases, than I ever will be. They probably have to work at those levels of optimization.

    But I'm usually dealing with a few thousand transactions per second, mostly just feeding dynamic content to websites. Not that those don't need tuning, but not at the kind of level you're looking at with RAM disk vs SSD for tempdb.

    The kind of performance issues I have to deal with are things like:

    When we got a new engine for our websites, it was built by a third party company. In preparing to go live, we found out that the home page was going to take an average of 6 minutes to load, because of really, really, really poor database design. I spent a day re-architecting their database, a couple of days working with the web devs on regression testing to make sure I hadn't broken anything, and got the average load time to a few milliseconds. Query time went from 6 minutes to about 8 milliseconds, for the exact same data. That page gets a few hundred hits per second, from all over the world. 8 milliseconds is good enough for what we need. Could it be 7 instead? Probably, with the right amount of work done at the server level.

    We had an ETL process that, besides losing data, was also taking up to 12 hours to run through a few thousand records per day. I inherited this from my predecessor (who left this job to manage databases for a bank - which is a scary thing from my perspective). I fixed it, refactored it, and took the average time for the export process down to about 10 minutes. Again, I'm sure it could be refined down to 8 minutes if I spent the necessary hours to fine-tune the server environment.

    Prior employer, had a daily process that I was asked to find out why it was "failing" so often. Day one on the job, I found the "daily" process was taking anywhere up to 70 hours to run. Part of the problem is it was using a VB.NET construct to read XML files line by line, then using string functions to parse the data out of the XML, then staging each value into a Name-Value-Pair construct, then reconstituting the data into properly normalized tables. (Yes, that was "part of the problem". It had other issues on top of that.) I converted that part to a bulk import of the XML files, then XQuery to parse it directly to the normalized tables. Run time went down to something like 20 minutes just from that one refactor. That also got rid of some data-loss issues. Took me about 2 or 3 hours to find the problem and fix it. No server-level work needed at all. Could I have taken it down to 15 minutes instead of 20 by spending a few days optimizing the I/O channels for the text files? Probably.

    In all of these cases, and thousands more just like that, I've sped up data processes by simply huge margins, without having to worry about details at the level you're playing with. I'm interested in what you find, because I'm curious about that kind of thing, but I'm dubious about its value except possibly in some extreme edge cases (like, possibly, EVE).

    The extra time, probably measured in days or weeks, it would take to fine-tune each server in the manner you're experimenting with, just isn't worth the potential ROI for someone like me. That same time can be spent re-architecting, re-coding, etc., much more profitably.

    So, all I can do is give you what I've found on the subject. I think you'll need to experiment with it on your own to really find out what, if any, benefit you get from it. I am interested in what results you get, but mainly out of curiosity, not need.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I really appreciate your time, and I'll try and remember to come back and post after we implement our new storage solution (may be a couple months). I commented on Gail's articles, hopefully she will respond. I did notice someone else posted about Q4 in that Microsoft article you mentioned, and Gail responded that she's proven that MS article to no longer be true (or never was true?).

    Thanks again for your time.

  • You're welcome.

    Hope I didn't come across heavy-handed or anything. Just trying to say I don't have definitive answers for you. Someone probably does, but just not me. Wish I did.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • kevin.kembel (12/21/2012)


    GSquared (12/20/2012)


    Table variables, temp tables, and worktables, only spill out of RAM onto drives, if they have to because of lack of available RAM, per MS.

    I'd heard temporary tables, unlike variables, were always stored in TempDb's data files and not in RAM - do you have any links or articles that explain this that I could read? From what I can find it looks like we're both wrong, it looks like both table variables and temporary tables are always stored in TempDb and not in memory.

    http://cybersql.blogspot.ca/2009/12/yes-table-variables-and-temp-tables.html

    We've noticed in our live environment that the most active database for disk i/o in our system is TempDb, it actually noses out our OLTP. This makes sense if the article I linked to is true, and temporary tables and table variables are always written to the TempDb data files. We didn't think this was because of a lack of RAM (currently 32GB for our 35GB OLTP database), but I guess we'll know when we move to the 64GB system and monitor TempDb usage there.

    I wrote a well-received article here on SSC a few years back that covers the differences: Comparing Table Variables to Temporary Tables[/url]

    As far as memory or tempdb, check out this from http://support.microsoft.com/default.aspx?scid=kb;en-us;305977&Product=sql2k :

    A table variable is not a memory-only structure. Because a table variable might hold more data than can fit in memory, it has to have a place on disk to store data. Table variables are created in the tempdb database similar to temporary tables. If memory is available, both table variables and temporary tables are created and processed while in memory (data cache).

    Summarize: if the data will fit in ram, it might not be written out to disk. But it always has to be able to be written out to disk, since the data might be greater than the available memory to hold the data. If it's written out to disk, it's written out to the tempdb database.

    Don't forget that if your queries end up creating worktables, those are in tempdb. Your high IO from tempdb could just be from poorly performing queries or execution plans that were generated based upon poor statistics (don't forget that table variables don't have any statistics, so if you're using them you just might see some pretty remarkable IO improvements by changing to temp tables).

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • I think Wayne's explanation goes pretty far to resolving the OP's confusion, but I think there's one other clarification about SQL Server's internal workings that might help.

    SQL Server creates and stores data in "pages" of about 8K each. Rows of data from tables, indexes, etc. are stored in these pages. SQL Server reads pages from disk into the data cache (RAM) (or creates them there) when required by the T-SQL statements being executed. SQL Server performs operations on the rows in those pages in the data cache and writes information about those operations to the transaction log. As a result, a page in the data cache may differ from the version of that same page on disk - a so-called "dirty" page. Every so often, "dirty" pages are "flushed" to disk, i.e., the changed pages in the data cache are written to disk.

    For both a temporary table and a table variable, SQL Server creates the object in tempdb, but this only requires the creation and allocation of the necessary pages in the data cache. As long as SQL Server has enough RAM available, SQL Server can hold the pages of the temp table/table variable in the data cache and does not have to (but apparently may) write them to disk. If the amount of data stored in the temp table/table variable exceeds the amount of RAM available to the data cache, the temp table/table variable "spills over" and SQL Server must write those pages to disk, where they will live until SQL Server reads them into the data cache again. Of course, local temporary tables and table variables are dropped when the creating process terminates, whether the pages exist on disk or in the data cache.

    So temporary tables and table variables always "use" tempdb in the sense that they are created there, but their data pages may exist only in the data cache. While Gail Shaw's article cited above demonstrates that temp table/table variable pages MAY be written to disk in the absence of memory pressure (the point she was trying to prove in rebutting the "table variables are memory-only" myth), it doesn't prove that they MUST be written to disk (which is not what I think Gail was trying to prove). Without knowledge of the underlying code, though, I can't say when or how SQL Server decides to write temp table/table variable pages to disk when there is no memory pressure.

    Jason Wolfkill

  • kevin.kembel (1/2/2013)


    I've look at Google and your links, and it still appears to me that what you said before isn't entirely correct - which is why I was hoping for some explicit clarification. The articles you just linked to are a perfect example of what I mean. They seem to contradict each other, as well as what you said about table variables and temp tables (quoted above). So far I'm still assuming that they do always write to disk - both temp tables and table variables because that's the bulk of what I've read as well as what I've experienced when I monitor my TempDB's MDF read/write usage. If RAM is available, they cache as per usual, but they will still always write to disk even if sufficient RAM is available.

    I think that the source of confusion here is teh difference between allocating space for some data on the disc and actually writing the data to the disc.

    Every page of table data, whether permanent table, temp table, or table variable, has to have space allocated for it on disc; however, it's possible that the data is never written to the space allocated for it. Most of the time, data is never written to the space allocated for table variables or for temp tables - that's what Gus is telling you; but all the time, space is allocated on disc so that the data could be written to disc if that became neccessary - that's what some of the references are telling you, but apparently it is so badly phrased as to appear to contradict what Gus told you.

    Tom

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply