Statistics Enhancements in SQL Server 2012

, 2012-02-29

Today I want to talk about Statistics Enhancements in SQL Server 2012. As you already

know SQL Server uses Statistic Objects in a database to estimate the cardinality of

queries. The cardinality estimation is the input for the generation of the physical

Execution Plan of a given T-SQL query. Data changes over time, data distribution changes

over time, so SQL Server is also able to change Statistics Objects in the background

automatically for us. This feature is called Auto Update Statistics.

SQL Server updates Statistics automatically under the following conditions (when the

table is not a temp table):

  • A table with no rows gets a row
  • A table has fewer than 500 rows and is increased by 500 or more rows
  • A table has more than 500 rows and is increased by 500 rows + 20 percent of the number

    of rows

Whenever SQL Server is updating the statistics automatically, SQL Server Profiler

reports the Auto Stats event. On the other hand, SQL Server is also

able to create Statistic Objects on the fly, when SQL Server needs to estimate the

cardinality for a given query and no index (and therefore no Statistics Object) is

available, like in the following one:

SELECT * FROM Table1

WHERE Column1 > 1

GO

When Column1 is not indexed, SQL Server will create a new Statistics

Object for you.

All these things are great to improve performance for queries, but what happens when

SQL Server can't change the underlying Statistics Object or isn't able to create a

new one for you? Just think of databases that are marked as read only, or when you

are using Database Snapshots. Data in Database Snapshots can't change, but you can

issue completely new queries that leads to the demand of creating new Statistics Objects.

In that case, SQL Server can't create/update the Statistics Object, which means SQL

Server can't re-evaluate the cardinality estimation and proceeds with the cached execution

plan, which can have a huge performance drawback for your read only workload. Let's

demonstrate this problem with a simple example. I'm creating a new database with a

simple table in it:

USE master

GO

CREATE DATABASE StatisticsDatabase

GO

USE StatisticsDatabase

GO

CREATE TABLE Table1

(

Column1 INT IDENTITY,

Column2 INT

)

GO

In the next step I'm populating the table with 1500 records:

SELECT TOP 1500 IDENTITY(INT, 1, 1) AS n INTO #Nums

FROM

master.dbo.syscolumns sc1

INSERT INTO Table1 (Column2)

SELECT n FROM #nums

DROP TABLE #nums

GO

When we select everything from the table, SQL Server uses the Table Scan operator,

because we have not defined a Clustered Index on our table:

SELECT * FROM Table1

GO

To demonstrate the problem with errors in the cardinality estimation I'm creating

a Non-Clustered Index on Column2.

CREATE NONCLUSTERED INDEX idxTable1_Column2 ON Table1(Column2)

GO

The following query uses the previous created Non-Clustered Index:

SELECT * FROM Table1

WHERE Column2 = 2

GO

SQL Server creates an Execution Plan with a Non-Clustered Index Seek and RID Lookup

(Heap) operator – the traditional Bookmark Lookup operator. By now we have a table

with 1500 records, which means we need 20 percent + 500 rows of data changes, until

SQL Server will update the underlying Statistics Object of the Non-Clustered Index.

In our case these are 800 data changes in our table (300 + 500). So let's insert 800

additional records into the table:

SELECT TOP 800 IDENTITY(INT, 1, 1) AS n INTO #Nums

FROM

master.dbo.syscolumns sc1

INSERT INTO Table1 (Column2)

SELECT 2 FROM #nums

DROP TABLE #nums

GO

SQL Server will update the Statistics Object as soon as a new SELECT query will reference

it. Prior to that we will now mark our database as read only, so that SQL Server can't

change anything anymore:

USE master

GO

-- Mark the database

as readonly

ALTER DATABASE StatisticsDatabase

SET READ_ONLY WITH NO_WAIT

GO

USE StatisticsDatabase

GO

Let's now run the query which will trigger an update of the Statistics Object:

SELECT * FROM Table1

WHERE Column2 = 2

GO

Under normal conditions, SQL Server will now re-evaluate the cardinality estimation

and create an Execution Plan with a Table Scan operator. But in our case SQL Server

can't change anything in the database, therefore SQL Server will reuse the inefficient

cached Execution Plan with the Bookmark Lookup! When you look at the Execution Plan

you will also see a huge difference between the Estimated and Actual Number of Rows:

That's the behavior of SQL Server 2005/2008 (R2), and you can't change anything about

it. The same problem will occur when you are using Database Snapshots.

But with SQL Server 2012 Microsoft has addressed that problem. SQL Server now supports

so-called Temporary Statistic Objects which are created in TempDb.

Every time when SQL Server wants to create or update an Statistics Object on a read

only database or on a Database Snapshot, SQL Server will create the necessary Statistics

Object in TempDb. For this functionality SQL Server needs 1 data page per each Statistics

Object in TempDb.

So when you run the previous code on SQL Server 2012, SQL Server will create a new

temporary Statistics Object in TempDb, and will create an Execution Plan with a Table

Scan operator, which makes much more sense, because our query is not selective enough

anymore.

Temporary Statistics Objects are also used by Readable Secondaries in AlwaysOn in

SQL Server 2012, because Readable Secondaries also have the same problem: they are

used for read only workloads, and SQL Server can't change anything in the Secondary

Replica, because all Replicas must have the same physical structure in the Availability

Group. So every time when you are working with readable Secondaries in SQL Server

2012, you will also have an overhead in TempDb, because of temporary Statistics. And

Readable Secondaries are also using transparently Snapshot Isolation, which also impacts

the performance of TempDb. But that's a topic for another weblog posting J.

Thanks for reading!

-Klaus

Rate

Share

Share

Rate

Related content

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Question: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? This question was sent to me via email. My reply follows. Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? Databases to be mirrored are currently running on 2005 SQL instances but will be upgraded to 2008 SQL in the near future.

2009-02-23

1,567 reads

Networking - Part 4

You may want to read Part 1 , Part 2 , and Part 3 before continuing. This time around I'd like to talk about social networking. We'll start with social networking. Facebook, MySpace, and Twitter are all good examples of using technology to let...

2009-02-17

1,530 reads

Speaking at Community Events - More Thoughts

Last week I posted Speaking at Community Events - Time to Raise the Bar?, a first cut at talking about to what degree we should require experience for speakers at events like SQLSaturday as well as when it might be appropriate to add additional focus/limitations on the presentations that are accepted. I've got a few more thoughts on the topic this week, and I look forward to your comments.

2009-02-13

360 reads