SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
Search:  
 
 

SQL Musings

Add to Technorati Favorites Add to Google
Author Bio
Steve Jones Editor at SQLServerCentral.com You can follow Steve on Twitter as way0utwest (www.twitter.com/way0utwest)
Browse by Tag : performance (RSS)

Separating Out Indexes

By Steve Jones in SQL Musings | 10-15-2009 7:51 AM | Categories: Filed under: , ,
Rating: (not yet rated) Rate this |  Discuss | 1,160 Reads | 558 Reads in Last 30 Days |6 comment(s)

I saw a post recently where someone was asking how to separate out all indexes from the data into a separate NDF file. This was the same post that I wrote about recently with the thread myth.

The poster seemed confused on a couple points. The first was that he or she thought that they could separate out the clustered and non-clustered indexes from the data.  That’s alone makes me think that this user is not advanced enough to work with multiple filegroups if they don’t understand the table v clustered index structure.

The second thing was thinking a separate file improves performance if the indexes are moved. This can improve performance, but a couple things need to take place.

  1. The indexes need to be heavily used at the same time as other data. You are trying to separate out items that are heavily used concurrently. If I need 1,000,000 rows from a structure (table or index) now and another 5,000,000 rows when that operation completes, I’m not helping things if I separate these items. If two users are actually accessing this data at the same time, or one user with a join, then separate files can help.
  2. You need separate physical disks. A lot of people don’t think this through properly. It doesn’t mean two volumes or drive letters, and it doesn’t mean two LUNs on a SAN. It means separate physical storage devices. They can be HDDs, SSDs, or something else, but they need to be separate physical devices.

The user admitted this was something they heard, and would likely not separate things out. I think that’s a good move and should save them some headaches.


SQL Server Legend – Data Files and Threads

By Steve Jones in SQL Musings | 10-13-2009 8:38 AM | Categories: Filed under: , , ,
Rating: |  Discuss | 2,447 Reads | 979 Reads in Last 30 Days |11 comment(s)

Someone was asking about using multiple data files recently to try and increase performance. I had answered that unless you had separate physical disks that it wouldn’t matter.

However then I remembered hearing something about threads and files for I/O. I tjhought this was a myth, but I wasn’t sure. I searched around, and then pinged Gail Shaw since I know she does a lot of internals type investigation.

She confirmed this is a myth and sent me this reference: SQL Server Urban Legends Discussed. It’s from the Microsoft Customer Service engineers and discusses the origin of the myth and how things work.

The bottom line is that SQL Server uses a thread for each unique disk drive, not files. In SQL 2000, you can fool the system if the disk drives are volumes on the same physical disk. However in SQL 2005 the system checks to see if these are the same physical disk.

There are some caveats, but if you hear this rumor, dispel it. Let people know that multiple file groups (or files) only help if you have different physical I/O paths and drives.


SSC Site Issues

Rating: (not yet rated) Rate this |  Discuss | 2,035 Reads | 232 Reads in Last 30 Days |5 comment(s)

It was kind of amazing to see SQLServerCentral have issues over the last few days. We’re still not sure what happened, but the load went crazy over the last few days, substantially slowing email sends and causing performance problems.

There are definitely some database, issues, and we are looking to provision a larger database server in the short term. That’s not a great fix, but we’re a business like any other and hardware is often quicker and easier to deal with than code in the very short term. As we dug in, we realized the same database server we’ve had for over 2 years, with more than double the load, just can’t keep up. I’m not sure if anything else has changed code-wise recently, but the quick fix is a new server. Since we have a hosted server with Rackspace, this is actually a fairly easy and relatively inexpensive fix.

Beyond that, we have some code to examine. As with most businesses, we had developers built the site, without a real DBA working alongside them. It wasn’t a great solution, but it also is the realization of how resources sometimes get deployed. I think our developers did some neat things in one sense, not so good in another. There is a boatload of nHibernate code in there, and some of those pages appear to be amazing POS constructs. If nothing else, I think I would be terrified of ever using nHibernate in any project just based on my experiences here.

As things seemed to fail today, we made the decision to move the forum database to our backup server. We’ve kept a second one handy for DR purposes, and since the code is fairly separate, it seemed like an easy fix to reduce the load.

Brad McGehee expressed concerns, which seem to be well founded. I’m having issues working in the forums, so I think there are still things broken.

In one sense this is the opposite problem that we had for years. Early on, Andy, Brian, and I coded quite a few things on the site, often in a half-a**ed way, since we weren’t programmers. We did have lots of RI and normalized data, but the front end code wasn’t anything to showcase. Now it seems we’re in the reverse situation. There is a lot of nice front end code, well documented and structured, but the database is a bit of a mess.

Where we go from here I’m not sure, but I expect that the next couple of weeks will involve some root cause analysis as well as some refactoring of code. I’d prefer to throw out the nHibernate design and build a simpler structure that is easier to tune, as well as maintain, but I’m not sure that will happen.

However I will try to continue to update you with information about how we will proceed.


The Windows 7 RC Download Issue

By Steve Jones in SQL Musings | 05-13-2009 9:56 AM | Categories: Filed under: ,
Rating: (not yet rated) Rate this |  Discuss | 1,607 Reads | 124 Reads in Last 30 Days |1 comment(s)

A better explanation from the SQLCAT team in this blog entry. I’d still like to see more, but this makes sense.

Also an update from Paul Randal of SQL Skills on his speculation of the issue.


Until You Know Better…

By Steve Jones in SQL Musings | 03-30-2009 5:48 AM | Categories: Filed under: , , ,
Rating: (not yet rated) Rate this |  Discuss | 3,521 Reads | 386 Reads in Last 30 Days |no comments

Stick with the defaults. In fact, as Andy Warren has told me many times, unless you know better and have a reason to not do so, always go with the defaults. I think that’s good advice, and it’s why I think that defaults in applications, especially SQL Server, should be set for the values that are appropriate 80% of the time.

I saw a post on memory management that got me thinking. Someone was asking if settings have changed in 2008 and I needed to look around a bit to see if that was the case. I hadn’t seen much on this lately, and I need to write a short piece on that.

However in looking around, I stumbled across this article from Technet on How to determine proper SQL Server configuration settings. As I read through it, there was a nice trend:

In actual practice, if you change the affinity mask setting from the default it only rarely helps performance, and will frequently degrade performance.

Based on actual production experience, you do not need to use Fiber mode except in very rare circumstances.

In SQL Server 2000 or in SQL Server 2005, you cannot change the max async IO configuration setting. SQL Server 2000 or SQL Server 2005 automatically tunes this setting.

If you configure a number of worker threads to a value that is greater than the default, it is almost always counterproductive and slows performance because of scheduling and resource overhead.

These quotes from the various sections of the article seem to show that you don’t want to mess with the tuning of these “knobs” very often in SQL Server. Near the beginning of this article has this quote:

If you do make a configuration change, you must perform rigorous methodical performance testing both before and after the change to assess the degree of improvement.

This sums things up nicely. Rigorous and methodical testing should be performed or you’ll likely be making things worse and potentially not realize it at that time. For the majority of people, you should stick with the defaults unless you know better.


MySpace

By Steve Jones in SQL Musings | 02-18-2009 5:29 AM | Categories: Filed under: , ,
Rating: (not yet rated) Rate this |  Discuss | 2,906 Reads | 147 Reads in Last 30 Days |1 comment(s)

MySpace has been the Premier SQL Server site, and probably the largest SQL Server site, on the Internet. It’s a case study that is fascinating, and they’ve been surprisingly candid at times about how their systems are built.

Recently someone posted a link to a presentation from their lead DBA on how they’ve architected their 150 database servers. It’s over an hour long, but it’s very interesting.

I’ve seen some of this before, some of it in articles, some of it in presentations, but it’s always fascinating to hear how they moved through architectures as they ran into extreme scaling issues. The evolution of their architecture makes sense, and I think they have done a good job growing over time, despite the various pain points.

Thinking scalability is something you always want to consider, even though most of us will never hit those levels where we need thousands of spindles. A few of us will, but most of us won’t. Therefore while you want to think scale out early, you don’t want to get too wrapped up in it and don’t want to build too much of it early on. Scaling out is hard, takes administrative work, and slows development.

So how do you do it?

I’ve never had a huge system, but we have always talked scale as we moved along. Mostly we identified the places where scale would impact us and tried to be aware of where we could start to re-architect things if we want into scaling issues. In one financial company we worried about growth, and we were having issues, but upgrading from SQL 65 to 2000 and then going from a 2way to an 8 way box masked problems, and that was the best way to handle things.

However along the way we also identified that pricing was an issue for us. A constant stream of quotes for securities overwhelmed the system, so we planned for that to exist separately, used a separate connection in the app to get pricing, and then we could move that to a new database, instance, or server if needed. We never grew that far, but we thought about it and architected small changes in there.

This is a great video to watch, but it’s long, an hour presentation, and you have to pay some attention to it. The big takeaway? Scaling out requires most of the logic to live in the application, not the database. Not that you can’t have some logic in the database, but you need to be sure that you aren’t depending on a particular database, and you replicate that logic to all instances of the db.


SQL Quiz - Mistakes You've Made?

By Steve Jones in SQL Musings | 11-11-2008 9:29 AM | Categories: Filed under: , ,
Rating: (not yet rated) Rate this |  Discuss | 2,828 Reads | 119 Reads in Last 30 Days |no comments

I'd seen Jason Massie's post on Facebook of this SQL Quiz, but when I found Brian Kelley's post, I decided I needed to jump in there.

Mistake #1 - Downtime

Definitely my fault, and I should have known better. I had come to Denver to work at a small company, and a relatively small database. This was in 1999, and we had a 20 or 30GB database, a decent size at the time, but not huge. However we had performance issues, and we had fires to fight every day.

We had a few very large tables that were used for many joins and written to constantly. On SQL Server 6.5, there was a limited amount of things we could do to improve things, especially in the short term, and about half my time was patching things. The other half was slowly improving things. I thought I was getting a handle on things, which is probably the wrong attitude. I should have been looking to see what I didn't know.

One of the issues was archiving old data. I finally convinced someone that we could archive some data to another database and use a view to join the current table with the archive table when we needed to. My plan was to archive the data over and get it done quickly. I decided over lunch I could move some number of hundreds of thousands of rows from one database to another. We had fairly fast servers, and based on testing, I figured it would be 20-30 minutes tops.

I should have tested it out in the QA environment. They were busy, but I should have kicked them off and really stressed the system. I had wrapped an insert and a subsequent delete in a transaction to ensure things moved. They did, but I found the delete was a problem. It ran slowly, which didn't make logical sense, but deleting 100k rows > 10 * (delete 10k rows). Much, much larger, by a factor of 10 at least. 

Needless to say I had things locked so I couldn't check progress. When people came back from lunch (20 minutes past my deadline), I kept telling them it would be any minute. Eventually the disks filled with logs, things rolled back, and we barely got the system back 90 minutes later. All of our customer people had to scramble to get work done by the end of the day. I didn't make many friends that day.

If I'd tested it at anything close to scale I would have realized that there was an issue and batched things up in groups of 10k, or even 5k, rows.

 

 


One File Per Core?

By Steve Jones in SQL Musings | 10-10-2008 4:30 PM | Categories: Filed under: ,
Rating: (not yet rated) Rate this |  Discuss | 2,254 Reads | 110 Reads in Last 30 Days |no comments

For a long time in my career I heard various advice on how many files you should create for a database. Since SQL Server 7, when it we had filegroups and multiple files, I've heard various pieces of advice, and honestly I'm not sure what the truth is and what is urban legend.  Over the years I've heard

  • One file per disk array
  • Use multiple files for performane since each gets its own thread
  • One file per CPU
  • One file per core.

I've seen a few postings recently where people were asking questions and even some of the very technically savvy MVPs have been confused by this. While reading through a few of the posts, I found this entry from Paul Randal on Search Engine Q&A #12: Should you create multiple files for a user DB on a multi-core box?

The entry is great in that it tries to explain some of the reasoning behind the advice and debunk some of the advice from various sources, including MS and hardware vendors.

The bottom line is that multiple files per CPU (or core now) really applies to tempdb, when you have allocations occurring from multiple threads. And then it's like 1 file per 2 or 4 cores, something you'd want to test.

For your database files, you separate things out when it makes sense. When you can affect performance by either moving heavily used tables/indexes to separate arrays (not just separate files). Or when you can move very lightly loaded, or needed data, like archive data, out of the way (think partitioning here) to ensure that it doesn't need to be scanned when it's not needed.

 


TechEd - Execution Plans

By Steve Jones in SQL Musings | 06-13-2008 6:51 AM | Categories: Filed under: , , ,
Rating: (not yet rated) Rate this |  Discuss | 6,531 Reads | 435 Reads in Last 30 Days

Randy Dyess of Solid Quality Mentors had a session and since I'd just gotten a copy of Grant Fritchey's book with the same name, I wanted to check it out.

There are 3 types of plans: graphical, text, and XML.

Graphical is cool, looks good, but it doesn't have enough information and once you are in a non-trivial situation, you have to scroll around and it's hard to hold all that information in your mind.

Text plans are what you need to learn to read. Especially in grid mode, we deal with this format for data all the time, so this makes lots of sense for many DBAs, so spend some time working on this. Once you get used to reading text plans, likely this will be your preferred method.

XML - Also hard to read, perhaps need to know XQuery, might be worth learning if you want to know more about XML. Randy doesn't like this, but I think it's worth reading about XML a bit and learning how it works since I think we'll have this data around for a long time. However reading XML isn't easy as things lik whitespace and other characters are encoded. Therefore it's not "human readable" and needs to be consumed by something. However XQuery should help here.

All plans can generate estimated or actual plans. Sometims you can't get actuals since it would be hours to execute the query, or you can't run it on the production system and need to know, or have a good idea of what's happening. Estimated plans help here, but they are not guarenteed to be the same as the actual plan.

One problem with estimated plans is temporary tables. You might need to comment one out to get the plan, and it might not be a good representation of the final plan. If you create a permanent work table and remove it, the same thing happens.

Text Plans

In the text plans, follow the parent column in your plan. It contains numbers, and the highest number happens first. The recommendation is to start tuning with the highest numbers, tuning there first since a change there will flow down to lower numbers. An observation from Randy that SQL Server seems to do the hardest or more resource intensive things first.

Options:

  • STATISTICS PROFILE ON
  • SHOWPLAN_ALL 
  • SHOWPLAN_TEXT
  • STATISTICS IO
  • STATISTICS TIME ON

Statistics IO gives you IO statistics from the query. You get more informaiton in 2008 as this has been enhanced. This is really read and scan information that you can use to determine where you might think about tuning the query. A big thing here is to try and reduce physical reads, but you need to run this a few times and be sure that the physical reads aren't the first ones to populate the cache. Once the data is in the cache, it might perform much better. 

In this case, Randy looks to reduce logical reads. If those can be reduced, even if the data is not in cache, physical reads might be reduced.

Execution plans can also be pulled from memory. You can grab the plan from the cache, which is actually what is being used by SQL Server to execute the query. DMVs are available to help here.

How do you tune?

Randy's Goals

  • Optimize duration / CPU usage
  • Optimize IO usage

Tracking duration meaning physical clock time, helps to determine how busy the system is. It's good to know an average time for procedures to execute as a baseline. Know that this will grow over time (potentially) As you add data.

CPU usage - Track this since it's a limited resource. Knowing the top usage queries can allow you to focus your efforts on those queries that are used often and impact the CPU.

Large, complex queries usually have large batches of code, each of which can be tuned as a mini-execution plan.

Top Slowdowns

  • Table and clustered index scans
  • Index scans v index seeks
  • Bookmark lookups
  • Join methods
  • Sorts
  • Compute Scalar

Table and clustered index scans are the leading cause of bad performing queries. Not always, sometimes you want this, but it is often bad. A CI scan is a table scan really, but could have better performance since you don't have as many random IOs as on a heap. Randy has this is the number one thing to tune when he sees this. If he can remove this, it often helps. Partial scans are sometimes ok, such as scanning a set of data all grouped together.

Index scans are beter than table scans, but still costly. This usually means a lot of rows ned to be read for the query. Investigate to see if you can turn these into seeks. CI seeks are the best things to have since they hit the data, but only specific rows.

Bookmark Lookups - A non-clustered index is used, but then you must use the CI key to go read the CI and get the data. In 2005 the INCLUDE columns can help remove these. In 2005/2008, you don't see this operator. You see RID (row identifier) and then a Lookup to get the data. These are hard to resolve, you might include 3-5 columns, but not more. You want to review the columns used by indexes to see if you can cover more queries. Or perhaps see if you really need to return that column and perhaps remove the columns. Don't SELECT *, that can cause this. You might also try creating aditional indexes  to be used for joins or perhaps, if appropriate for that table, change the NCI to a CI. This shouldn't be done lightly, especially if you've spent time considering what the best CI should be.

Be sure that your NCI doesn't duplicate the CI keys. They are already included.

Join Methods - SQL Server uses 3 types. Nested loop, Merge, Hash.

Nested loops are for smaller inputs, and has low memory usage. When you have a small and large tables, this often occurs. You don't want this between two large tables. Nested loops may resort your data.

If you have two large tables, a merge join in better. It's for larger inputs, and middle in terms of memory being used.

The hash join is for large tables, uses lots of memory. A row from the first table is run through a hash algorithm and then stored. The hash buckets are stored in MemToLeave, which is only 256MB on 32 bit machines. Above that, the hash tables moves to tempdb, which is much slower.

Sorts are expensive. Don't add them unnecessarily. DISTINCT, UNION, ORDER BY, GROUP BY, aren't always needed. Don't add them out of habit. UNION ALL does not produce a sort because duplicates aren't removed. If possible, allow indexes to presort things.

Compute Scalar is a function being used by the optimizer. Try to reveiew for implicit functions and watch if you have this in a WHERE clause 

 

 


TechEd - IO and Wait Stats

By Steve Jones in SQL Musings | 06-11-2008 6:50 AM | Categories: Filed under: , , ,
Rating: (not yet rated) Rate this |  Discuss | 7,189 Reads | 359 Reads in Last 30 Days

Andrew Kelley, SQL Server MVP, is a good speaker, so I'd recommend his sessions if you want good technical information.

IO Stats are a snapshot in time since the instance was restarted, so you must have at least two reads to compare the data. Stats are a snapshot. They are not reset until the instance is restarted. This means that you must be storing this data between checks to have any meanungful data. A single read from the stats DMVs doesn't give you a reference as to what the server is doing.

NOTE: Auto close does reset the stats (not sample_ms, but the counters). Be aware of this, especially with smaller applications/databases.

Solid State DIsks - Not many people using because of costs, but more and more they are in places, tempdb, pagefile, etc.

sys.dm_io_virtual_file_stats - Stores data from all databases, holds dbid and fileid to differentiate.  Some changes in SQL 2008 that help.Some countere are separated out instead of being totals (wait stats, I think).

FileStats - Reported numbers are physical I/O, not logical. Logical can be many times larger than physical.

Don't forget tempdb. Lots of tempdb usage and it's easy to forget about it and concentrate on your user database.Often tempdb is used more than extensively than people realize and have a huge performance impact on the server.

Use FileStats with WriteLog waits to see how efficient your log writes are. Log writes are sequential writes, so these can slow down your activity. T1 must complete writing before T2, before T3. If T1 is delayed, all other transactions are delayed. Data writes can be deferred until there is more time and the changes held in memory.

Don't use multiple log files. No performance advantage for these. There is an exception in extremely large databases, but not for performance. 

Backups can skew the results. Account for them in watching stats. These are complete physical IO. Transaction log and database operations are both included here. Backups will have multiple reads. Each read can be up to xx (512kb??)  bytes. Accounting for this can be tricky. Not sure about verifying backups. Should account for reads. The account for this, you need a snapshot before and after the backups to get an idea of the reads used by the backups.

Filestats are easy to capture, so there's no excuse to examine these on a regular basis. Frequnecy might vary. Daily, weekly, what you need. Be sure you review the reports. Capturing the data doesn't help if you don't examine it. This data is important for a baseline.

Code:

DBCC DROPCLEANBUFFERS -- cleans out databases. Useful for testing.

Gather stats, build table with same structure of table, then load with snapshot from the DMV. Include the datestamp for the data.

Join with system DMVs for friendly names of files and databases.

----- 

 All these counters point to hardware issues. Not that hardware is bad, but it's a delay with OS, drivers, controllers, disk, cables, etc. Configuration can be an issue here.

 Log writes should be very short, so stalls should be < 2ms. Data writes can be longer, but they can impact things as well if they grow too much above 10ms.

Don't assume a SAN is fast. Make sure that you are checking and verifying there are not delays on those IO paths.

 The counters include aggregates from all threads. So 5s of wait or stall doesn't mean that it was 5 physical stats. If you had 5 threads, then you could have 1 physical sec of wait across each thread. Be aware of parallelism here.

Some common problems:

High stalls on writes

  • - add more write-back cache on the controller. This speeds up writes from the instance. 
  • You can change the ratio to be 100% writeback instead of read. SQL Server doesn't necessarily benefit from read caches on disk.
  • Also be sure that you are R1 or R10 or R01, not R5.
  • If possible add spindles to the array.
  • Maintain physical separation of data and logs on disks.
  • And don't put non-SQL Server traffic on the array.

High Stalls on reads

  •  optimize queries., index, tune, reduce scans or translate into seeks.
  • Same, make the controllers 100% writeback.
  • Use same disk strategies from writes. 
  • Don't forget tempdb 

 Wait Stats

 Anytime SQL Server has to wait for something, it records an event (WaitType). Time is in ms.

In 2005 there were 194 types, and more in 2008. In 2005 there wre some that were hidden, but more are exposed in 2008.

These numbers are cumulative from the last restart or the last clear. Users can reset these counters here, so be aware of this.

DMV is sys.dm_os_wait_Stats. Contains the type, the count, wait time in ms, max wait time, and signal wait time. Be aware that the max wait time is from the last reset, not between samples. Signal wait time is time from when resource is available until it is used. Wait time is from time resource needed until IS IT USED, not untiil it is available. 

 Wait stats can narrow down where bottleneck is. If there is a lot of signal time, you might have CPU pressure. This replaces dbcc sqlperf(waitstats) from SQL 2000.

dbcc sqlperf('sys.dm_os_wait_stats', clear)

Be careful of OLEDB waits.

If you report on these stats you'll see the top types of waits, which can help you diagnose where the slowdowns are in your system.

Common Waits 

CXPacket - From inefficient parallel processing, where one or more threads are waiting on things to finish. Hyperthreading can add to the problem here. Consider adjusting MAXDOP here to reduce this (at server or query level).

Locks (LCK_xxx) - Long running transactions or blocking. Tune, index, reduce contention.

ASync_NETWORKIO - client isn't getting data as quickly as SQL sends. May be network issues, but likely the client is to blame.

IO (PAGEIOLATCH, IO_COMPLETION, WRITELOG) - Physical IO waits. Waiting on disk into or from memory. Storage subsystems issues here. Writelog should be as log as possible.

Latches (pagelatch_xx) - Not IO related, this is contention with internal resources. Heaps and LOBs can cause latching waits. Lots of inserts into one page can cause issues here along with page splits.