SQL Server 2008 R2 TEMPDB massive autogrowth suddenly.

Question

SQL Server 2008 R2 TEMPDB massive autogrowth suddenly.

PretendDBA

SSCrazy

Points: 2060
More actions
January 6, 2013 at 9:46 pm

#268915

Today while i was at lunch, TEMPDB had massive autogrows until it filled the disk it sits on. When i returned from lunch there were no complaints from users of any issues just the email alerts from SQL,
DESCRIPTION:Could not allocate space for object 'dbo.SORT temporary run storage: 143231156879360' in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.
AND
DESCRIPTION:k:\SQLDATA\tempdb.mdf: Operating system error 112(failed to retrieve text for this error. Reason: 15105) encountered.
My first knee jerk reaction was to grow the disk which i did but then when i ran the disk usage report i saw TEMPDB was empty.
There is only one user DB host on this server which is a DB supplied by a company for a program we use. But i just want to find out what caused the issue so i can try to stop it happening again. I have logged a case with the vendor but is there anything i can check. I would of expected if TEMPDB filled the server would fall over. Or is it a matter of TEMPDB filled whatever was trying to run stopped and SQL server emptied the TEMPDB file again?

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply

Joie Andrew One Orange Chip Points: 27360 More actions · Answer 1

Did you ask the sysadmins of the application if they were running any jobs/queries/reports during that timeframe? Is there a log for the application that will let you know what it was doing during that time?

Joie Andrew
"Since 1982"

VastSQL SSC-Dedicated Points: 36067 More actions · Answer 2

VastSQL

SSC-Dedicated

Points: 36067

January 7, 2013 at 1:25 am

#1574193

Anything from SQL error log??

Grant Fritchey SSC Guru Points: 398679 More actions · Answer 3

Sounds like whatever transaction was running stopped and the tempdb emptied back out. Normal, if unfortunate, behavior. I'd try to track down where that query came from because, if your business people are anything like the ones I've supported, it'll be back.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

PretendDBA SSCrazy Points: 2060 More actions · Answer 4

Yep I've asked the app admin and report people everyone is claiming they were doing nothing.

All the users access the DB via the application front end. So general users shouldn't(wouldn't) be running queries directly against the DB the only exception is the report writers, they have a report DB restored from the nightly backup but sometimes they want full live data.

Most likely it was an glitch in the application i guess. some query the app executed got stuck and filled up tempdb?

Here are the errors in SQL error log.

--------------------------------------------------------------------------------------

Date7/01/2013 12:57:40 PM

LogSQL Server (Current - 7/01/2013 1:00:00 PM)

Sourcespid2866

Message

Error: 17053, Severity: 16, State: 1.

----------------------------------------------------------------------------------------

---------------------------------------------------------------------------------

Date7/01/2013 12:57:40 PM

LogSQL Server (Current - 7/01/2013 1:00:00 PM)

Sourcespid2866

Message

k:\SQLDATA\tempdb.mdf: Operating system error 112(failed to retrieve text for this error. Reason: 15105) encountered.

-----------------------------------------------------------------------------------

-------------------------------------------------------------------------------------

Date7/01/2013 12:57:41 PM

LogSQL Server (Current - 7/01/2013 1:00:00 PM)

Sourcespid2866

Message

Error: 1105, Severity: 17, State: 2.

--------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------

Date7/01/2013 12:57:41 PM

LogSQL Server (Current - 7/01/2013 1:00:00 PM)

Sourcespid2866

Message

Could not allocate space for object 'dbo.SORT temporary run storage: 143231156879360' in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

------------------------------------------------------------------------------------------

Orlando Colamatteo SSC Guru Points: 182276 More actions · Answer 5

The reference to dbo.SORT in the error message says to me that the growth was due to an internal work table that spilled to tempdb. Some reasons why this could have occurred:

- It could have spilled due to memory pressure that is not always present.

- It could happen if an ad hoc query returned more rows than the query author expected (cartesian product maybe) and the execution plan had a Sort operator within it.

- It could happen to a query that typically runs normally if your stats are outdated and the engine picked a bad plan with a Sort in it that it may not have picked had your stats been up to date.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

PretendDBA SSCrazy Points: 2060 More actions · Answer 6

Thanks for the tips.

Memory pressure,

The server has 32GB of RAM, 28GB is assigned to SQL server as MAX memory. TempDB is presized to 10GB (app vendor best prac). The user DB that is hosted on the server is 48GB, the tempdb grew in about 3 minutes from 10GB to 20GB filling the disk. It would have to be an aweful lot of memory pressure to make SQL release 20 odd GB of memory to tempdb wouldn't it? Yes this server is excluive to SQL so SQL should only be competeing with Windows services for resources.

As for stats, the vendor gave us some update stats jobs that run every morning before the core userbase arrives. there is also a job that runs every hour that updates stats on some heavy hit tables. indexs are also rebuilt nightly.

All Jobs are from app vendor.

So what does this leave as likely outcomes?

PretendDBA SSCrazy Points: 2060 More actions · Answer 7

Could your second point be caused by a "hero report writer" :)?? the server has been humming alot fine for almost 3 years then this suddenly yesterday.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 8

davidwarner (1/7/2013)
So what does this leave as likely outcomes?

Someone ran a query that pulled a lot of data and put an order by on it (or something that required a sort). SQL attempted to sort, split the sort to TempDB (sorts are very prone to spilling because of the sheer volume of workspace memory they need), the sort filled TempDB, caused the query to fail.

You'd need to identify who was running what. Unless there was some auditing or tracing already in place, SQL won't have a record.

I'd speak to anyone that can run custom queries, see what they were doing.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Joie Andrew One Orange Chip Points: 27360 More actions · Answer 9

You might be able to check recent expensive queries to try and identify the query that caused your tempdb issue. You can try seeing if Activity Monitor in SSMS will show you the query. Grant also had a really good SQLPASS web seminar that used DMVs to get info on long running queries. I believe the code he used was:

--Long running queries

select top 10 *

from sys.dm_exec_query_stats

cross apply sys.dm_exec_sql_text(sql_handle)

cross apply sys.dm_exec_query_plan(plan_handle)

order by total_elapsed_time/execution_count desc

Of couse these solutions are only good if SQL has not been restarted since the TempDB errors, as the DMVs will only collect metrics from the time SQL was started.

Joie Andrew
"Since 1982"

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 10

Joie Andrew (1/7/2013)
Of couse these solutions are only good if SQL has not been restarted since the TempDB errors, as the DMVs will only collect metrics from the time SQL was started.

Those aren't since SQL started. They're queries whose plans are still in cache.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Joie Andrew One Orange Chip Points: 27360 More actions · Answer 11

Ah, great point. I did not realize that. That is an important difference. Thanks for pointing it out Gail!

Joie Andrew
"Since 1982"

PretendDBA SSCrazy Points: 2060 More actions · Answer 12

I've sent an email to all our report writers, reports are written in Crystal. so i'd expect the queries are hardly optimised. problem is becuase it's the start of the year big reports are being run though they "should" be run on the dedicated report server. Some of these guys think they know better.

There are also built in reports in the application that some have been modified by the report team. these can be run by anyone with required security rights in the application all it asks for is a date range. These are all run off the live DB as well. I've asked the report writers to check any reports they might have changed in the last few weeks.

I feel it must be a report as if it was an application glitch we should have had it before. unless it was just one of those things that need a perfect situation to occur.

The server has not been rebooted since it happened.

Richard Fryar Hall of Fame Points: 3457 More actions · Answer 13

The default trace records which login is responsible for file autogrowth. This sometime shows as sa, for example during some system operations, but it is worth a try.

Here's the script I use:

use master;

set nocount on;

declare @path sql_variant

select @path = value

from ::fn_trace_getinfo(0) i

join sys.traces t on t.id = i.traceid

where t.is_default = 1 and i.property = 2

select df.StartTime,

e.name,

df.DatabaseName,

case when df.ApplicationName like 'SQLAgent%JobStep%'

then isnull((select name from msdb..sysjobs where job_id = CONVERT(uniqueidentifier, convert(varbinary(16), substring(df.ApplicationName, charindex('(Job', df.ApplicationName) + 5, 34), 1))), df.ApplicationName)

else df.ApplicationName end AppName,

df.LoginName,

df.SPID,

df.Duration,

df.EndTime,

df.FileName,

df.TextData,

df.IntegerData

from ::fn_trace_gettable(convert(nvarchar(255), @path), 0) df

join sys.trace_events e ON df.EventClass = e.trace_event_id

where e.category_id = 2

order by df.StartTime desc

If the default trace file has rolled over, modify the first bit to look at an older file.

Check Your SQL Servers Quickly and Easily
www.sqlcopilot.com

Orlando Colamatteo SSC Guru Points: 182276 More actions · Answer 14

To view all default trace files:

Reviewing AutoGrow events from the default trace by Aaron Bertrand

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato