Datawarehouse - My PO keeps asking for more tempdb
Posted Wednesday, November 21, 2012 12:21 PM
SSC Eights!


Group: General Forum Members
Last Login: 2 days ago @ 10:35 AM
Points: 866, Visits: 2,377
It sounds like you have to be very careful about how you rebuild those large tables - note that index rebuilds can take tempdb space ranging from 0% to 232% of the size of the index, depending on online vs. offline, clustered vs. nonclustered, drop-and-create vs. rebuild, and the SORT_IN_TEMPDB option.
http://msdn.microsoft.com/en-us/library/cc966402.aspx
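To make that concrete, here's a minimal sketch of the rebuild option that moves the sort into tempdb (table and index names are made-up placeholders, not from this thread):

```
-- Hypothetical example: rebuild a large index online, sorting in tempdb.
ALTER INDEX IX_FactSales_DateKey ON dbo.FactSales
REBUILD WITH (ONLINE = ON, SORT_IN_TEMPDB = ON);

-- With SORT_IN_TEMPDB = ON the sort runs in tempdb, which needs roughly
-- index-sized free space there; with it OFF, the sort uses space in the
-- user database's own filegroup instead.
```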

In general, for a data warehouse, I'd expect large historical queries looking at weeks, months, or years of data - larger datasets need larger tempdb space, and a mere 40GB is something I'd consider small even for a 500GB database used for large historical queries... or for any database where any single index is more than 40GB/1.07 or so.
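If you want to check where your own indexes stand against a threshold like that, a rough sketch (run in the database in question; nothing here is specific to the OP's system):

```
-- List the ten largest indexes in the current database by used space,
-- to compare against the tempdb-sizing rule of thumb above.
SELECT TOP (10)
       OBJECT_NAME(ps.object_id)                  AS table_name,
       i.name                                     AS index_name,
       SUM(ps.used_page_count) * 8 / 1048576.0    AS used_gb   -- 8KB pages
FROM   sys.dm_db_partition_stats AS ps
JOIN   sys.indexes AS i
       ON i.object_id = ps.object_id AND i.index_id = ps.index_id
GROUP BY ps.object_id, i.name
ORDER BY used_gb DESC;
```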

I agree with the above comments regarding parameter sniffing and stats-sensitive queries, though I'll note that unless you can rewrite the queries/application to completely prevent it, it's unpredictable when you'll hit it - maybe some busy day your stats updates can't keep up, or some customer hits several parameter sniffing cases based on their requirements. You'd still need a larger tempdb for them to be able to operate at all.

I have to take issue with your 10GB tempdb limit. Why 10GB? Why not 8GB or 12GB? Is it 10GB on a 5GB database as well as a 1500GB database? Is it 10GB for OLTP as well as small historical queries as well as large historical queries? There's too much "it depends" in some of these situations for a set rule of thumb like that to sit well with me.

More importantly, why so stingy with your tempdb? For an OLTP database doing offline index work, that may be perfectly fine, but I would consider a tempdb data size that is 2.6% of the total data, and 11.8% of the largest table's clustered index, to be very small for data warehouse work.

P.S. Local SSDs in a RAID config (either SAS or PCIe) can easily top your 8Gb FC channels, and in a throughput-limited situation they can do so while leaving the FC channels free to carry their full throughput of other traffic, allowing for a still higher aggregate throughput.

P.P.S. Gail, I don't work with the OP, but I have worked with servers which have more than enough RAM (100GB+) to allocate 35GB to a single query and not bat an eye; they're built primarily for large historical queries, of which we usually have less than two running.
Post #1387578
Posted Wednesday, November 21, 2012 12:51 PM


SSC-Forever


Group: General Forum Members
Last Login: Today @ 3:00 AM
Points: 42,774, Visits: 35,871
Nadrek (11/21/2012)
P.P.S. Gail, I don't work with the OP, but I have worked with servers which have more than enough RAM (100GB+) to allocate 35GB to a single query and not bat an eye; they're built primarily for large historical queries, of which we usually have less than two running.


Not really...

That's 35GB over and above the plan cache and the data cache. So for a server with 100GB of memory to allocate 35GB to a single query requires that there is more than 35GB unused, not allocated to the data cache or plan cache. That's what doesn't happen.

I've worked with servers with 128GB memory or more. A 35 GB sort will still likely spill to TempDB because while there's more than 35GB of memory allocated to SQL, SQL is not keen on throwing 35GB out of the data or plan cache to run a single query.

35GB of workspace memory for a single query. Not 35GB of buffer pool space (my file server in the lounge has almost that)
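The workspace-memory-vs-buffer-pool distinction is easy to see for yourself. A sketch of one way to watch actual memory grants on a live server (standard DMV, no assumptions about the OP's environment):

```
-- Show current workspace memory grants per session: how much each query
-- requested, was granted, and is actually using. A big gap between
-- requested and granted, or waiting sessions, means grants are throttled.
SELECT session_id,
       requested_memory_kb / 1024 AS requested_mb,
       granted_memory_kb   / 1024 AS granted_mb,
       used_memory_kb      / 1024 AS used_mb
FROM   sys.dm_exec_query_memory_grants
ORDER BY requested_memory_kb DESC;
```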



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1387592
Posted Wednesday, November 21, 2012 8:50 PM


SSC-Dedicated


Group: General Forum Members
Last Login: Yesterday @ 3:51 PM
Points: 36,959, Visits: 31,469
Brian-444611 (11/15/2012)
you kinda sound like my DBEs :)


but no seriously this advanced search thing has an endless number of parameters and generates 100+ line queries that are just ugly. it's like "lets do a 30 table view" situation


I'm with you. Queries shouldn't be so sensitive to stats and they shouldn't be "all-in-one" queries that join 30 tables. The use of multiple smaller queries that hold much smaller result sets (maybe even just one depending on the data) in thoughtfully created Temp Tables can greatly reduce resource requirements and cause performance to skyrocket.

Take a look at the execution plan for one of those queries. If you see operators with row counts much larger than any of the tables, you probably have a query with some many-to-many joins that could well benefit from "Divide'n'Conquer". A dead giveaway is the use of DISTINCT (or possibly GROUP BY) to overcome the problem of duplicated rows in the result set (although you could still have the problem without this particular smell).

Then rewrite the query by dividing it up and see if it doesn't perform better while using less of tempdb in the process. If you can pull it off, use it as a teaching tool to get the devs to write code a bit more thoughtfully. Speaking of "thoughtfully", don't scream at the devs about it. They're doing the best they can with what they know. Be a mentor about it and teach them a better way. If you can't make the code any better, give them an additional 40GB of tempdb, because there's no room to talk unless you can set the example.
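The "Divide'n'Conquer" rewrite above might look like this minimal sketch - all table and column names are invented for illustration, not taken from anyone's actual schema:

```
-- Instead of one huge 30-table join, stage the small driving result set
-- first in a thoughtfully created temp table...
SELECT c.CustomerID
INTO   #FilteredCustomers
FROM   dbo.Customer AS c
WHERE  c.Region = 'West' AND c.IsActive = 1;

CREATE CLUSTERED INDEX cx ON #FilteredCustomers (CustomerID);

-- ...then join outward from it, so each subsequent join touches far
-- fewer rows than the all-in-one query would.
SELECT o.OrderID, o.OrderDate, o.Amount
FROM   #FilteredCustomers AS f
JOIN   dbo.Orders AS o
       ON o.CustomerID = f.CustomerID;
```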


--Jeff Moden
RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1387673