Statistics (quickly) out of date

  • Hello. Today I had an issue where an export out of a DB was taking all day. I finally found that SQL was estimating a row count of 1 on a 100M row table. It turned out that the statistics were out of date. Through DBCC SHOW_STATISTICS it's easy to find where your most recent values are out of range on a date column. But my weekly update stats job ran fine 3 nights ago. So my question is: how often do we have to do this? I know in SQL 2012 there's a better formula for deciding when to auto-update statistics. I was really surprised that they got out of date so quickly. When current values are outside your histogram, is SQL doing a table scan? It seems like it.

    Ken

  • ken.trock (6/23/2015)


    So my question is how often do we have to do this?

    As often as necessary. I've heard of a case where that was every hour.

    Nothing wrong with having jobs which do manual stats updates when the auto update isn't adequate.

    When current values are outside your histogram, is SQL doing a table scan? It seems like it.

    No. It does whatever the optimiser decides is cheapest for an estimate of 1 row. A table scan is fairly benign. An index seek + key lookup tends to be much less so.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • To follow up on what Gail said, the issue of poor estimations due to out of date statistics on ascending columns is a fairly well-known problem. https://www.simple-talk.com/sql/database-administration/statistics-on-ascending-columns/ is a good read on the subject.

    On a table with 100 million rows, that might combine unhelpfully with the default auto update threshold, which tends to require too many modifications before it updates statistics on large tables. Microsoft documented that and the trace flag that changes that behavior here: https://support.microsoft.com/en-us/kb/2754171. Basically, the flag changes the threshold to be dynamic, so that the required change percentage drops as the table gets larger.
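    The two thresholds can be compared numerically. A minimal Python sketch, using the formulas cited in this thread (20% + 500 for the default, SQRT(1000 * rows) for trace flag 2371); modeling the flag as applying whichever threshold is lower is my reading of the dynamic behavior, not wording from the KB article:

    ```python
    import math

    def default_threshold(rows):
        """Classic auto-update rule: 20% of the table plus 500 modifications."""
        return 0.20 * rows + 500

    def tf2371_threshold(rows):
        """Dynamic rule attributed to trace flag 2371: SQRT(1000 * rows).
        Modeled here as applying only when lower than the classic rule."""
        return min(default_threshold(rows), math.sqrt(1000 * rows))

    for rows in (10_000, 1_000_000, 120_000_000):
        print(f"{rows:>12,} rows: default {default_threshold(rows):>13,.0f}, "
              f"TF 2371 {tf2371_threshold(rows):>11,.0f}")
    ```

    For a table in the 100M+ range this works out to tens of millions of modifications by default versus a few hundred thousand under the trace flag, which is exactly the gap that bites on large tables.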

    As Gail said, you should be able to just update your stats more frequently. The trace flags around these behaviors are not usually necessary, but it's good to be aware of the default behavior.

    Cheers!

  • Just keep in mind that updating statistics causes queries to recompile. There is a performance tradeoff between improving query plans and the time it takes to recompile queries.

  • Thanks everyone. Even on a 120M row table SQRT(1000*<num rows>) is only a 350k row threshold. I'm bringing in at least that many every night. So not sure why they got out of date.

    No matter, I took the 2 UPDATE STATISTICS statements I used to fix this and put them at the end of this table's ETL. Ran in 4 seconds (a bargain at twice the time!!), sampling < 500k rows each time. That rate seems to be sufficient.

    Ken

  • ken.trock (6/26/2015)


    Thanks everyone. Even on a 120M row table SQRT(1000*<num rows>) is only a 350k row threshold. I'm bringing in at least that many every night. So not sure why they got out of date.

    The default threshold is 20% + 500, so for a 120 million row table the auto-update threshold is just over 24 million modifications, not 350k. Or do you have trace flags on? (Can't recall offhand which does what.)
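    As a quick sanity check on that arithmetic (Python, using the 20% + 500 formula above):

    ```python
    rows = 120_000_000
    threshold = 0.20 * rows + 500  # default rule: 20% of rows, plus 500
    print(f"{threshold:,.0f}")     # → 24,000,500
    ```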

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • In 2012 you still have to enable 2371 to get the new auto update behavior. It's easy to confirm following the same methodology as used here: http://www.sqlskills.com/blogs/erin/understanding-when-statistics-will-automatically-update/.

    I just ran a quick test where I populated a table with 1,000,000 rows on a 2012 SP2 server. Statistics did not auto-update until 200,500 modifications (following 20%+500). With 2371 enabled, it updated after 31,622 modifications (following the new formula SQRT(1000*Rows)).
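    Both observed figures match the formulas; a quick Python check (31,622 is SQRT(1000 * 1,000,000) truncated to an integer):

    ```python
    import math

    rows = 1_000_000
    classic = 0.20 * rows + 500        # 20% + 500 rule
    dynamic = math.sqrt(1000 * rows)   # TF 2371 rule
    print(int(classic), int(dynamic))  # → 200500 31622
    ```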

    Cheers!

  • Jacob Wilkins (6/26/2015)


    In 2012 you still have to enable 2371 to get the new auto update behavior. It's easy to confirm following the same methodology as used here: http://www.sqlskills.com/blogs/erin/understanding-when-statistics-will-automatically-update/.

    I just ran a quick test where I populated a table with 1,000,000 rows on a 2012 SP2 server. Statistics did not auto-update until 200,500 modifications (following 20%+500). With 2371 enabled, it updated after 31,622 modifications (following the new formula SQRT(1000*Rows)).

    Cheers!

    Maybe the original poster doesn't actually have that trace flag set and is assuming the new formula is in play?

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson (6/26/2015)


    Jacob Wilkins (6/26/2015)


    In 2012 you still have to enable 2371 to get the new auto update behavior. It's easy to confirm following the same methodology as used here: http://www.sqlskills.com/blogs/erin/understanding-when-statistics-will-automatically-update/.

    I just ran a quick test where I populated a table with 1,000,000 rows on a 2012 SP2 server. Statistics did not auto-update until 200,500 modifications (following 20%+500). With 2371 enabled, it updated after 31,622 modifications (following the new formula SQRT(1000*Rows)).

    Cheers!

    Maybe the original poster doesn't actually have that trace flag set and is assuming the new formula is in play?

    True, I thought the new formula was the default. This Microsoft engineer seems to imply as much: http://blogs.msdn.com/b/srgolla/archive/2012/09/04/sql-server-statistics-explained.aspx

    Anyhow, it seems the trace flag is needed.

    Ken

  • ken.trock (6/26/2015)


    sgmunson (6/26/2015)


    Jacob Wilkins (6/26/2015)


    In 2012 you still have to enable 2371 to get the new auto update behavior. It's easy to confirm following the same methodology as used here: http://www.sqlskills.com/blogs/erin/understanding-when-statistics-will-automatically-update/.

    I just ran a quick test where I populated a table with 1,000,000 rows on a 2012 SP2 server. Statistics did not auto-update until 200,500 modifications (following 20%+500). With 2371 enabled, it updated after 31,622 modifications (following the new formula SQRT(1000*Rows)).

    Cheers!

    Maybe the original poster doesn't actually have that trace flag set and is assuming the new formula is in play?

    True, I thought the new formula was the default. This Microsoft engineer seems to imply as much: http://blogs.msdn.com/b/srgolla/archive/2012/09/04/sql-server-statistics-explained.aspx

    Anyhow, it seems the trace flag is needed.

    Ken

    Yeah, I remember getting quite excited when I read that article many moons ago, also inferring that it might be the default. Unfortunately, that excitement quickly faded when tests revealed the new behavior was not the default. At least the trace flag doesn't require a restart.

    Cheers!
