Alerting when tempdb files have X% free space

Question

Alerting when tempdb files have X% free space

Jay@Work

SSCrazy

Points: 2631
More actions
August 13, 2014 at 8:32 pm

#298098

I have a tempdb split into 4 files (5 if you include the log).
Autogrowth is disabled on the mdf/ndf files so that they can be used round robin (1 file per logical CPU).
Is there a way to be alerted when there is x% of free space left?
I know hwo to check the free space via t-sql but want to be able to be alerted. I could run a sql job that reports the free space and send a database mail message if under x% but wondered if there was a built in (or better) method?
I also have SQL Sentry.
SQL 2012 Standard

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

Steve Jones - SSC Editor SSC Guru Points: 734469 More actions · Answer 1

You can use alerts (http://www.sqlservercentral.com/articles/alerts/89885/), though I'm not sure these are better than a job with a query in it. I think they'll both function similarly.

I am not sure if your SQL Sentry product (PErformance Advisor?) does these alerts, but if I'm using that already, I'd centralize the alerts there. I know Red Gate's SQL Monitor can easily do this.

Disclosure: I work for Red Gate.

SQLRNNR SSC Guru Points: 281334 More actions · Answer 2

iirc there is a custom metric in SQL Sentry that can be applied for this kind of alert. Then it is just a matter of configuring the thresholds and notifications. I would just do it from SQL Sentry so the alerts are centralized.

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

Jeff Moden SSC Guru Points: 1003870 More actions · Answer 3

Jay@Work (8/13/2014)
I have a tempdb split into 4 files (5 if you include the log).
Autogrowth is disabled on the mdf/ndf files so that they can be used round robin (1 file per logical CPU).
Is there a way to be alerted when there is x% of free space left?
I know hwo to check the free space via t-sql but want to be able to be alerted. I could run a sql job that reports the free space and send a database mail message if under x% but wondered if there was a built in (or better) method?
I also have SQL Sentry.
SQL 2012 Standard

I can't speak of SQL Server 2012 Standard but my system is using TempDB files in a "round robin" fashion and I've not had to disable authogrowth to get that to happen.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jay@Work SSCrazy Points: 2631 More actions · Answer 4

Thanks Jeff.

Have your tempDB files ever had to autogrow or were they created with an initial size large enough that this hasn't been necessary?

Our Sharepoint consultants advised the 1 tempdb file per processor approach and also specified autogrowth be disabled for this to work.

While googling the topic (never split tempDB before) I came across a couple of places that seem to back this up. Just found this one.

http://www.sqlconsulting.com/news1308.htm

SQLRNNR SSC Guru Points: 281334 More actions · Answer 5

Jay@Work (8/14/2014)
Thanks Jeff.
Have your tempDB files ever had to autogrow or were they created with an initial size large enough that this hasn't been necessary?
Our Sharepoint consultants advised the 1 tempdb file per processor approach and also specified autogrowth be disabled for this to work.
While googling the topic (never split tempDB before) I came across a couple of places that seem to back this up. Just found this one.
http://www.sqlconsulting.com/news1308.htm

Round robin is the default behavior for these files.

The reason to disable autogrow partly comes back to managing your database and the beliefs of some as it being a best practice.

Part of the reason is to try and prevent an autogrow to ensure the files will remain the same size and "maintain" a balanced file use.

Here is a better resource on the number of files to create. It's not so much one file per core, but better aligned with sgam needs.

https://www.simple-talk.com/sql/database-administration/optimizing-tempdb-configuration-with-sql-server-2012-extended-events/

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

SoHelpMeCodd SSCertifiable Points: 5871 More actions · Answer 6

The size of tempdb depends upon uses of it. As a simple example, if a system submits T-SQL that references a temporary table, then one copy of that temporary table will be created per user concurrently running that T-SQL. There is no common or typical size for tempdb, and it will grow as needed (especially when a system is brought into production) and should find a stable state - at which point that is the size it should be.

The URL's T-SQL clause:

WHERE wait_type LIKE 'PAGELATCH_%'

AND (resource_description LIKE '2:%:1'

OR resource_description LIKE '2:%:2'

OR resource_description LIKE '2:%:3')

is incomplete, as explained by SQL Server Storage Engine Program Manager Sunil Agarwal at http://blogs.msdn.com/b/sqlserverstorageengine/archive/2009/01/11/tempdb-monitoring-and-troubleshooting-allocation-bottleneck.aspx, a "PFS page will appear every 8088 pages in a file. The GAM and SGAM will appear every 511232 pages " (not just pages 1, 2 and 3. A less technically challenging (SharePoint) example is at http://blogs.msdn.com/b/sqlgardner/archive/2011/12/05/identifying-contention-in-tempdb.aspx, which reveals tempdb contention that was resolved by altering SharePoint's "inheritance" (not by throwing more tempdb files at the problem), and presents a screen shot with page numbers that are far from pages 1, 2 and 3.

Some fairly solid advice is again offered by Sunil, at http://blogs.msdn.com/b/sqlserverstorageengine/archive/2012/07/20/9274319.aspx?PageIndex=2. Another good source of information is Microsoft's SQL Server Principal Escalation Engineer Bob Dorr, at http://blogs.msdn.com/b/psssql/archive/2009/06/04/sql-server-tempdb-number-of-files-the-raw-truth.aspx (note that Bob states "One file per CPU that SQL Server sees", not "one file per processor") and http://blogs.msdn.com/b/psssql/archive/2008/12/17/sql-server-2005-and-2008-trace-flag-1118-t1118-usage.aspx. Plus, the former SQL Server DBCC Program Manager Paul Randall offers some solid advice on multiple pages on his site (SQLSkills.com), and I see Jason beat me to a referral of SQLSkills contributor and SQL Server MVP Jonathan Keyahias' recent excellent article.

If you want to see a non-production system go down hard and quickly, turn off tempdb autogrowth, set its file sizes to the minimum allowed, and finally let ostress.exe run hundreds of queries that harness tempdb. Once the 1105 has been raised on tempdb (in SQL Server's errorlog), you should find most of SSMS to be unusable (much of its T-SQL harnesses tempdb) and you should find a great many system stored procedures that fail to run (they too can use tempdb). I suggest you trust but verify what most SharePoint consultants suggest 🙂

Jay@Work SSCrazy Points: 2631 More actions · Answer 7

Thanks. As an "accidental DBA" I find a lot of the articles on SQL internals a tad heavy.

For every article I read that talks of split tempDBs being a good thing I find another warning of it's use. Including Paul Randall's myth debunking ones.

I've actually emailed Paul on this very topic as when it comes to SQL knowledge I'd believe him if he told me that putting my head in the oven could improve query performance 🙂

What I am getting mixed messages on is the ability for the "round robin" method to work if autogrowth is turned off.

I *never* turn off autogrowth on any DB hence me running to the forums to check the validity of the claim.

At present this (yet to be put into production server) has 4 database files all with an initial size of 5GB and autogrowth disabled on each.

How long is a piece of string... I know,

This is a sharepoint farm of approx 250 users that use doc libraries for storing and retrieving docs - so nothing overly stressful for the server(s)

TempDB is on a dedicated disk - 100GB.

If disabling autgrowth is NOT a requirement of multiple tempDB files then I won't disable it and therefore won't run the risk of the system coming to a halt.

If disabling is the better option then I will disable and set up alerts to give me plenty of time to plan a stop/start of the sql service in order to rebuild tempdb

SQLRNNR SSC Guru Points: 281334 More actions · Answer 8

Disabling autogrowth is not a requirement.

Size up your files in advance. Round Robin will work.

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

Jeff Moden SSC Guru Points: 1003870 More actions · Answer 9

Jay@Work (8/14/2014)
Thanks Jeff.
Have your tempDB files ever had to autogrow or were they created with an initial size large enough that this hasn't been necessary?
Our Sharepoint consultants advised the 1 tempdb file per processor approach and also specified autogrowth be disabled for this to work.
While googling the topic (never split tempDB before) I came across a couple of places that seem to back this up. Just found this one.
http://www.sqlconsulting.com/news1308.htm

As I do with a single file TempDB, I attempt to make the files large enough to handle the loads based on the historical size of TempDB with a decent buffer for the future so that they don't grow under normal circumstances. Growth of TempDB is normally a planned event for me. They work just fine in a round robin fashion (even a far back as SQL Server 2005, which is what we're currently stuck with at work) without disabling growth. This isn't just an opinion on my part. I've watched the file usage during heavy usage because I wanted to be sure.

The reason why I don't disable auto-growth is because if someone runs a bad query, it will quickly consume the 16GB of TempDB that I currently have allocated and the whole server and everyone using it would suffer immediately. 16 GB is nothing for an accidental and unchecked many-to-many join to consume. The other thing is, I want some time to capture the query in progress and the machine it is being run from so that I can find and educate the person that executed the query (they've usually put some ad hoc code together without the necessary understanding of T-SQL, the design of the tables, or any regard that they're running something on a production box). If I were to cap the size at only 16GB, such a query would quickly crash the server leaving me virtually no time to investigate and correct the cause. I've seen such a query consume 500GB in just 15 minutes. 15 minutes *16GB / 500GB would only allow me 0.48 minutes to find the source of the problem. That's only 28.8 seconds. Even with an alert that fires off a proc to find and capture the required information when temp db reaches 90% of one of the files, the system might crash even before the proc could finish saving the info. Even if the proc was able to determine the cause before a crash, system crashes cause me personal serious embarrassment and also cost the company huge amounts of money as well as getting a serious black-eye (bad word-of-mouth travels faster than the wind) in the eyes of the customers currently logged in. I'd much rather kill a SPID than to have the SPID kill me or the company that I work for. 😀

Why not put TempDB on its very own Terabyte drive and allocate it all to TempDB? You could do that and set the alarm to 90% full and that'll let even some of the most ridiculous of queries run without crashing the system... but that'll only give you 100GB of warning and, like I said, 500GB in 15 minutes will give you only about 3 minutes to react for 100GB of free space left. Maybe less because you no longer have pauses for growth no matter how small such pauses may be.

I also failed to mention that "the most ridiculous of queries" aren't the worst queries that will ever happen. :hehe: I stopped that one runaway query at 500 GB. The estimated execution plan that I generated on the query, post-mortem, allowed me to estimate (after MUCH work but had to prove it because my head was on the line for stopping the year-end report the GM wanted "right now") that it would have tried to consume nearly 2TB (wouldn't have fit on the 1TB drive, guaranteed crash) if I hadn't stopped it. As a bit of a sidebar, once I fixed the query with some Divide'n'Conquer methods, it ran in 8 seconds flat and barely touched TempDB. I might not have had the opportunity to even see the query had I capped the growth of TempDB. It's also why I'm absolutely paranoid about any queries that use DISTINCT. 😛

IMHO, it doesn't matter whether you have one file or multiple files. If you have such a runaway query, you're going to have some work to do to get things back to normal, no matter what. If you haven't determined what the problem was, it will happen again in the very near future when the same person tries to run it again 15 minutes from now. 😉 Let TempDB grow so you have time to discover the cause of such problems, to cancel the query without a crash, and to find the user before they can run it again. I have a 16GB TempDB sitting on a dedicated Terabyte drive to allow me to do just that when a similar problem next occurs... and it WILL occur again. It's just the nature of the beast.

As yet another sidebar, the reason why I don't have TempDB on multiple drives (one for each file) to take advantage of the fact that each spindle has its own set of Read/Write heads to get some true physical parallelism into TempDB, is because of the incredible size of the SAN drives today (freakin' ridiculous IMHO). I also won't put something else on a TempDB drive (it does only have one set of R/W heads) and it's bad enough that I've dedicated a Terabyte drive to a 16GB TempDB to begin with. I just can't see dedicating 4 or 8 drives of such size to TempDB for our particular case.

As still one more sidebar, unless you're running something like Wall Street and have a bank of MPP appliances at your beck and call, you might want to start fixing the cruddy queries that folks wrote if you have a lot of TempDB usage. Yeah, putting TempDB on SSDs will give you something like 3 to 10X speed improvements but those improvements pale in comparison to the 60 to 6000X improvement that you can, many times, get by fixing crap code. :Whistling:

Heh...jeez... look at the length of that post. My apologies. I didn't mean to go on so, but I thought it important that you know what I've been through and why I make the recommendations that I do, white papers be damned in some cases. :w00t:

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1003870 More actions · Answer 10

Jay@Work (8/14/2014)
I've actually emailed Paul on this very topic as when it comes to SQL knowledge I'd believe him if he told me that putting my head in the oven could improve query performance 🙂

That would be horribly incorrect advice from Paul. As an accidental DBA, it's your job to find the person that wrote the performance challenged query and put THEIR head in the oven... right next to the pork chops. 😀

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jay@Work SSCrazy Points: 2631 More actions · Answer 11

Yep, I have a dedicated disk for TempDB and only TempDB

This is a Sharepoint (SQL) server so the activity is "sort of" predetermined.

No other apps/DBs on this server.

I heard back from Paul and he confirmed autogrowth is fine... not too small and in set sizes not percentages.

He also recommended enabling traceflag 1118. This also makes sense as user objects are rarely created/dropped in a sharepoint environment.

Now, so long as my sharepoint developments don't go rogue I should be fine 😀