Rebuild index versus Update Statistics

  • I'm seeing an issue that baffles me so I'm hoping someone can confirm my suspicion.

    I have a query that's performing poorly. When I rebuild the indexes, the query continues to perform poorly. When I UPDATE STATISTICS WITH FULLSCAN on a specific table, the problem goes away.

    I know that rebuilding indexes also rebuilds the statistics.

    My question is this: Is it possible to override the default sampling for the statistics when rebuilding indexes or must I have a separate job to perform the FULLSCAN sampling?
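
    For reference, the two operations I'm comparing look roughly like this (the table name is just a placeholder for the real one):

        -- nightly maintenance job: rebuilds the index and, with it, the index statistics
        ALTER INDEX ALL ON dbo.SomeTable REBUILD;

        -- the manual step that actually makes the problem go away
        UPDATE STATISTICS dbo.SomeTable WITH FULLSCAN;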

    "Beliefs" get in the way of learning.

  • Index rebuilds (not index reorganise) always do a fullscan update of the statistics. The entire index has to be read to rebuild it, so there's no reason to do a sampled stats update.

    It may be that you have parameter sniffing (the stats update forces a recompile, and it's the recompile that fixes things).

    If this happens again:

    * Check the statistics before you rebuild and see whether the last update was sampled or fullscan (DBCC SHOW_STATISTICS). If it's sampled, it means that an UPDATE STATISTICS, sp_updatestats or auto update happened after the index rebuild.

    * See if a recompile of the procedure fixes things without a stats update. If it does, it may be parameter sniffing, not stale stats. (A rough sketch of both checks follows this list.)
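
    Roughly something like this; the table, index and procedure names are just placeholders:

        -- compare Rows vs Rows Sampled in the STAT_HEADER output;
        -- if Rows Sampled is well below Rows, the last update was sampled
        DBCC SHOW_STATISTICS ('dbo.SomeTable', 'IX_SomeIndex') WITH STAT_HEADER;

        -- mark the procedure for recompile without touching the statistics
        EXEC sp_recompile 'dbo.SomeProcedure';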

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for the reply. I tried the recompile thing without luck. Even though we rebuild the indexes every night, the only thing that makes the problem go away is the UPDATE STATISTICS WITH FULLSCAN.

    Parameter sniffing hadn't occurred to me, but the fact that it only occurs on some client sites and not on others has me wondering. Unfortunately, I can't re-create the problem and my management frowns on stored procedure changes in the field, even as a hotfix.

    I'm aware of how to fix a parameter sniffing problem, i.e. move the parameter to a local variable and reference the local variable in the query, but I'll have to convince my management to try it as a hotfix after I make the change.
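
    For illustration, the kind of change I mean looks something like this (the procedure and column names are made up; only the local-variable pattern is the point):

        CREATE PROCEDURE dbo.usp_GetPatientData   -- hypothetical procedure
            @PatientID int
        AS
        BEGIN
            -- copy the parameter into a local variable; the optimizer can't sniff
            -- a local variable's value, so it compiles against average density instead
            DECLARE @LocalPatientID int;
            SET @LocalPatientID = @PatientID;

            SELECT p.*
            FROM dbo.Patients AS p
            WHERE p.Patient_ID = @LocalPatientID;
        END;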

    "Beliefs" get in the way of learning.

  • Next time it happens, please, before you update the statistics, run a DBCC SHOW_STATISTICS and check if the last update was sampled or fullscan (rows compared with rows sampled). It may be that something's doing a sampled update after the index rebuild.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Okay, this problem has taken on a new dimension. Even when I rebuild the statistics, the query still performs poorly on one database but works fine on the seven or eight other databases I've tried. Any help on resolving why this refuses to do an index seek would be appreciated.

    I tried "disabling" parameter sniffing by placing the parameter value in a local variable and using the local variable in the query, but that didn't work. I get the same behavior. I ran DBCC SHOW_STATISTICS on most of the indexes being scanned and the statistics are all based on FULLSCANs and were rebuilt last night.

    I'm attaching a copy of the stored proc and the query plan that is produced when it is running poorly. You will note the index scan of the Patients table, which seems to ripple through the plan. This is an index seek on all of the other databases. There is a view referenced (vw_PatInfoData) that is an abstraction of a sloppy data structure and does recursive self-joins, but there are only 23 rows in the table that shows up as a table scan. If you want, I'll provide a copy of the view definition, but I don't think that's the problem.

    "Beliefs" get in the way of learning.

  • That is huge. No way I can do a full analysis of that.

    One thing that may be worth trying... There's one place where the estimated and actual rows are different: the table Patient_Info_Fields. It's got no clustered index and no usable NC index. May I suggest adding an index on it - (Entry_Sequence, Patient_Info_Field_ID) - and making sure that stats are updated on that table; something like the sketch below.
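
    A rough version of what I mean (the index name is just a suggestion):

        CREATE NONCLUSTERED INDEX IX_PatientInfoFields_EntrySeq
            ON dbo.Patient_Info_Fields (Entry_Sequence, Patient_Info_Field_ID);

        UPDATE STATISTICS dbo.Patient_Info_Fields WITH FULLSCAN;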

    It may be worth a bit of divide and conquer: insert the rows that you need from the view into a temp table, index it, then join to that instead. Worth a try if nothing else.
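
    As a very rough sketch, assuming the view from your post and whatever columns the main query actually needs:

        SELECT Patient_ID, Patient_Info_Field_ID, Field_Value   -- placeholder column list
        INTO #PatInfo
        FROM dbo.vw_PatInfoData;

        CREATE CLUSTERED INDEX IX_PatInfo ON #PatInfo (Patient_ID);

        -- then join the main query to #PatInfo instead of the view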

    What does a good plan look like? There's nothing I can see that SQL could use as a seek predicate on the patients table.

    Why is everything with nolock? Are you aware of the possible data-correctness problems with that (and I don't just mean uncommitted data)?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for the suggestion. That is the table with 23 rows in it. I tried re-working the view that does the recursive joins on that table (to reference the primary key) but, since all of the rows probably fit on one page anyway, I'm not having much luck. What baffles me is that it works fine in most places and I can't find a bad guy in the usual places, i.e. statistics, indexes, etc. I'll just have to keep digging, I guess.

    Don't get me started on NOLOCK. My first day on the job I told them it was a poor practice, but they got burned by locking issues in SQL Server 2000 due to a poor database design and their solution was to throw NOLOCK everywhere. (I wish I had a dollar for every time I've encountered that!)

    In fact, they have turned on READ COMMITTED SNAPSHOT, and when I demonstrated that using NOLOCK renders that feature useless they still weren't willing to take a chance on changing the practice. My current plan is to continue to lobby and provide examples of how much it is hurting them. They are getting complaints from the field that customers are seeing unpredictable results, but they still don't believe me when I tell them it's due to dirty reads.

    Frankly, I was a little surprised by their unwillingness even to talk about it. By the way, it is management that won't let us change it; the developers are on board.
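
    The demonstration I gave them was along these lines (table and column names are placeholders). With READ COMMITTED SNAPSHOT on, the plain SELECT isn't blocked and returns the last committed value, but the NOLOCK hint overrides the row versioning and reads the uncommitted change:

        -- session 1: leave an uncommitted change open
        BEGIN TRANSACTION;
        UPDATE dbo.SomeTable SET SomeColumn = 'dirty' WHERE ID = 1;

        -- session 2, in a database with READ_COMMITTED_SNAPSHOT ON:
        SELECT SomeColumn FROM dbo.SomeTable WHERE ID = 1;                  -- last committed value
        SELECT SomeColumn FROM dbo.SomeTable WITH (NOLOCK) WHERE ID = 1;    -- 'dirty' (dirty read)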

    "Beliefs" get in the way of learning.

  • 23 rows... How big are the rows? If all 23 will fit on a single page, then you're right, there's no point.

    Have you tried the divide and conquer? Insert results of view into temp table, use temp table in query? Sometimes it helps a lot (and sometimes it makes things worse)

    I can dig up a lot of blog posts from SQL experts that demonstrate the consistency issues with nolock clearly. Think it'll help?

    Do you work there or are you a consultant?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (5/7/2010)


    23 rows... How big are the rows? If all 23 will fit on a single page, then you're right, there's no point.

    Have you tried the divide and conquer? Insert results of view into temp table, use temp table in query? Sometimes it helps a lot (and sometimes it makes things worse)

    I can dig up a lot of blog posts from SQL experts that demonstrate the consistency issues with nolock clearly. Think it'll help?

    Do you work there or are you a consultant?

    Now that I've looked more carefully, the rows are a little over 1,000 bytes each, most of that in one varchar(1000) field, so it probably covers three pages. I'll try the temp table thing. Shouldn't take too long to try it out. I've been using that approach to replace some of the table-valued functions here but that's a hard sell as well.

    I appreciate the offer about NOLOCK. I've got a few sources but maybe you can point me to someone I haven't discovered yet. If nothing else, I'll find some new sources for knowledge.

    I've been a consultant for many years but I took this full-time gig about six months ago hoping to stay in one place for a while. Haven't decided whether I like it or not yet. As you can see, there are some frustrations that are difficult to live with but I keep telling myself that it takes a while for the Titanic to turn.

    Thanks again for your help.

    "Beliefs" get in the way of learning.

  • Robert Frasca (5/7/2010)


    Now that I've looked more carefully, the rows are a little over 1,000 bytes each, most of that in one varchar(1000) field, so it probably covers three pages. I'll try the temp table thing. Shouldn't take too long to try it out. I've been using that approach to replace some of the table-valued functions here but that's a hard sell as well.

    Try the index as well, not just to find the rows faster, but also so that SQL can retrieve them in order if it needs to and maybe do a merge join instead of a loop join (or maybe flip the order of the tables in the join around).

    I appreciate the offer about NOLOCK. I've got a few sources but maybe you can point me to someone I haven't discovered yet. If nothing else, I'll find some new sources for knowledge.

    This is the one I keep handy. http://sqlblog.com/blogs/andrew_kelly/archive/2009/04/10/how-dirty-are-your-reads.aspx There are others, I'll hunt, I kinda know where to look, so should be quick.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Robert Frasca wrote:

    Now that I've looked more carefully, the rows are a little over 1,000 bytes each, most of that in one varchar(1000) field, so it probably covers three pages. I'll try the temp table thing. Shouldn't take too long to try it out. I've been using that approach to replace some of the table-valued functions here but that's a hard sell as well.

    I know this thread is old but I ran into a similar issue recently and wanted to add a thought here:

    The table you mention may have few rows, but when it joins to other tables in a cartesian way, those values get repeated countless times. If one of those 'other' tables has a large varying-character field (maybe with bad or outdated statistics on it as well), you can end up with very large memory grants and sorts that spill into tempdb. That can bring things to a crawl.

    So a seek turning into a scan can cause a large variation in performance when a wide field is at play; see the sketch below for one way to spot it.
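
    One quick way to check for this while the query is running is to look at the active memory grants, something like:

        -- unusually large requested/granted memory (or a NULL grant_time,
        -- meaning the query is still waiting for its grant) is the tell-tale
        SELECT session_id,
               requested_memory_kb,
               granted_memory_kb,
               grant_time,
               query_cost
        FROM sys.dm_exec_query_memory_grants
        ORDER BY requested_memory_kb DESC;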

     

    ----------------------------------------------------
