Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Statistics Explained


Statistics Explained

Author
Message
Lynn Pettis
Lynn Pettis
One Orange Chip
One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)

Group: General Forum Members
Points: 26480 Visits: 38127

Without going out and buying yet another book on SQL Server 2005 by Kalen Delaney, can any one provide me with a good explaination of statistics and how the work and are used by the query engine without going into a lot of detail?

Thanks



Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Adam Wilbur
Adam Wilbur
Ten Centuries
Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)

Group: General Forum Members
Points: 1052 Visits: 591

Here's my brief answer...

Statistics are simply data SQL Server collects about a column's data selectivity, uniqueness and counts of particular values in that column. Stats exist on every index, however stats can also exist on columns that are not indexed.





A.J.
DBA with an attitude
Steve Jones
Steve Jones
SSC-Forever
SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)SSC-Forever (40K reputation)

Group: Administrators
Points: 40450 Visits: 18846
Adam has a good explanation, but I'll add mine:

Statistics are a way of guessing what the distribution of data is in a column. So if you have age, SQL "samples" the data and gets some idea of how many people are < 10, 11-20, 21-30, etc. Note these are my numbers, not really what SQL would use. It would just have an "idea" of the distribution. That way it can "guess" whether an index will help the query run quicker or if it's better to just scan the table.

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
Lynn Pettis
Lynn Pettis
One Orange Chip
One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)

Group: General Forum Members
Points: 26480 Visits: 38127

Along these lines, how much effort should I put in to understand statistics? Would getting Inside Microsoft SQL Server 2005: Query Tuning and Optimization be a worthwhile investment? Just curious as I have invested a lot in a variety of SQL Server 2005 books over the last few months.

Thanks.



Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Michael Earl-395764
Michael Earl-395764
Hall of Fame
Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)

Group: General Forum Members
Points: 3011 Visits: 23078

If you plan to be a DBA for a significant portion of time, invest the time in learning about query tuning. I would recommend you spare the week to take a class if you can get your company to pay for it.

It will do two things that are really important. First, it will teach you how to performance tune and troubleshoot slow queries in your databases. Second, it will give you a much better understanding of what SQL Server is doing with the query you have passed it so you can more intelligently make decisions on what to do and what not to do.

You may never specifically use performance tuning and query optimization, but understanding how SQL creates execution plans will be a great help in database design if you have to build anything new.


Lynn Pettis
Lynn Pettis
One Orange Chip
One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)

Group: General Forum Members
Points: 26480 Visits: 38127

I think I have a pretty good understanding about building and tuning queries, and the use and creating of indexes. What I am not sure about is statistics. In working on slow performance of our PeopleSoft Finance system after upgrading to SQL Server 2005, I ran an hour long trace and used that with DTA. It recommended we build 35 indexes and a lot more multi-column statistics. We have implemented those recommendations and have seen a significant improvement in performance. I can understand how the indexes help, but how do the statistics? If you are doing this manually, how do you decide if a statistic is more appropriate than index? Without going into a 3 to 30 page explaination, how are statistics used to improve query performance?

Or should I invest in the book I mentioned earlier by Karen Delaney?

Also, I work for a school district, training funds are very sparse.



Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
matt stockham
matt stockham
SSC Eights!
SSC Eights! (810 reputation)SSC Eights! (810 reputation)SSC Eights! (810 reputation)SSC Eights! (810 reputation)SSC Eights! (810 reputation)SSC Eights! (810 reputation)SSC Eights! (810 reputation)SSC Eights! (810 reputation)

Group: General Forum Members
Points: 810 Visits: 3178

In my limited understanding : statistics are used by the optimiser to determine the most efficient query plan, including which indexes to use and how. Using an index to identify rows might be extremely efficient if it's a unique index on a large table, but extremely inefficient if there is only a small number of distinct values in that index. On non-indexed fields the stats presumably help the optimiser determine what kinds of join are appropriate and which order to process each element in the plan.

If you haven't done so already, take a look at the properties for some of the stats in your tables - it helps to visualise the kind of information they represent.

I think statistics are definitely one of the least-understood aspects of performance tuning. Not sure about the book as I haven't read it, but I would expect there to be some blogs about stats somewhere.


Lynn Pettis
Lynn Pettis
One Orange Chip
One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)One Orange Chip (26K reputation)

Group: General Forum Members
Points: 26480 Visits: 38127

Based on that, would you say it is worth learning more about statistics? I am just curious how much time and effort I should put into some of this or if I should just "put my faith in Microsoft" and that DTA knows what's best.

Thanks



Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
matt stockham
matt stockham
SSC Eights!
SSC Eights! (810 reputation)SSC Eights! (810 reputation)SSC Eights! (810 reputation)SSC Eights! (810 reputation)SSC Eights! (810 reputation)SSC Eights! (810 reputation)SSC Eights! (810 reputation)SSC Eights! (810 reputation)

Group: General Forum Members
Points: 810 Visits: 3178

I think it's handy to understand if you have the time and inclination, but for the most part you don't need to know it in detail - just knowing that they are beneficial is probably sufficient. Having said that, it's another tool to use in performance tuning and I have occasionally made queries run faster due to that knowledge. If you just want to get your toes wet, knowing when statistics become out of date is useful.

Here's a document I found:

http://www.microsoft.com/technet/prodtechnol/sql/2005/qrystats.mspx


GilaMonster
GilaMonster
SSC Guru
SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)SSC Guru (54K reputation)

Group: General Forum Members
Points: 54266 Visits: 44637

I am just curious how much time and effort I should put into some of this or if I should just "put my faith in Microsoft" and that DTA knows what's best.

The DTA does not always know best. I've found that it tends to go overboard.

When I use it (which isn't often) what I'll generally do is implement its suggestions 1 by 1 and test the performance improvement, then I implement the ones that have the most impact.



Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search