Why create statistics on every column of every table in a database.

  • Here is the issue:

    Developers want statistics created on every column of every table in the database.

    SQL version = SQL Server 2005 SP2+.

    Database size < 200 GB.

    "Auto update" and "Auto create" stats are enabled.

    There are currently 11000+ user created statistics.

    Here is the question:

    Will SQL Server ever use those statistics?

    What other SQL Server process besides the optimizer uses the statistics?

    Is there any way to determine statistics usage?

    I need some ammo for my argument ... sock it to me.

    Enjoy
    "Give them the tools ... Not the keys"

  • I think you are creating more work for yourself and will make maintenance on the database a little more difficult.

    http://www.developer.com/db/article.php/3622881/Basics-of-Statistics-in-SQL-Server-2005

    That link discusses some of the workings of statistics. Statistics are not always created on just a single column at a time. Typically the engine has a very good understanding of data usage and can create accurate statistics for the queries being submitted. If every table has individual statistics created on every column, you could end up with a bunch of out-of-date statistics that could impair performance.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • I told them there is no reason to do that for every column and every table ... but they insist.

    I need to come up with some good reasons not to do it.

    That is why I'm asking for more ammo...


    Just like indexes, any additional statistics create maintenance overhead in both space used (they have to be stored someplace) and the duration of maintenance (it takes time to update them). If auto update stats is on, that can lead to slowness during the day while SQL Server updates stats that it may not even be using. If it's not on, time needs to be put into developing a maintenance plan that will do the updates, and the maintenance window gets longer. I would have to defer to someone with more knowledge on this point, but I would also think that with additional statistics to consider when calculating the query plan, compiling plans can take longer.

    There are also columns where the stats will never have the opportunity to be used. If SQL Server isn't using that column to search, then the stats won't be used. And if a column is frequently used but isn't indexed and doesn't have a manually created stat on it, SQL Server can create one itself if auto create stats is on. We have created stats in our DB, but for the most part we rely on SQL Server to do that when needed.
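
    To put some numbers on that overhead, a query along these lines can list the user-created statistics and when each one was last updated. This is only a sketch, assuming SQL Server 2005 or later and the sys.stats catalog view. It shows how stale each statistic is, not whether the optimizer ever used it; as far as I know, 2005 has no DMV that tracks statistics usage directly.

    ```sql
    -- List user-created statistics on user tables,
    -- with the date each one was last updated (NULL if never).
    SELECT  OBJECT_NAME(s.object_id)            AS table_name,
            s.name                              AS stats_name,
            s.no_recompute,
            STATS_DATE(s.object_id, s.stats_id) AS last_updated
    FROM    sys.stats AS s
    WHERE   s.user_created = 1
      AND   OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
    ORDER BY last_updated;
    ```

    Statistics that show a very old (or NULL) last_updated date on a busy table are at least candidates for the "nobody needs this" pile.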

  • SQLDraggon (3/23/2010)


    I told them there is no reason to do that for every column and every table ... but they insist.

    In that case, you can also ask them why they are forcing you to have more statistics.

    You can refer to this link: http://www.sqlservercentral.com/articles/Indexing/63534/ It might help you here.

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • SQLDraggon (3/23/2010)


    I told them there is no reason to do that for every column and every table ... but they insist.

    Put the burden on them. Ask them to provide you with reasons why you should do it, along with sources. That way you can more easily counter their demands.

    Jason...AKA CirqueDeSQLeil

  • Maybe this argument will help, too:

    Within one query, SQL server will only use one index per table. So, if you have the following query

    SELECT col1, col2, col3, col4

    FROM table

    WHERE col1='something' AND col2='something different'

    and you have an index on each and every single column then the query optimizer may not consider any of those indexes, since a bookmark lookup would be required to get all values used in the SELECT clause.

    It might be appropriate to use a single index on col1 and col2 with col3 and col4 as included columns.

    SQL server will not combine various indexes for one table within one query.

    Furthermore, it might even drop performance if the table in question is heavily used for updates/inserts/deletes, since each and every index would need to be changed.

    Edit: Incorrect answer.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question
    How to post performance related questions
    Links for Tally Table, Cross Tabs and Dynamic Cross Tabs, Delimited Split Function

    I'd ask them why they want auto create stats on if they are going to create the statistics themselves.

  • lmu92 (3/24/2010)


    Maybe this argument will help, too:

    Within one query, SQL server will only use one index per table. So, if you have the following query

    SELECT col1, col2, col3, col4

    FROM table

    WHERE col1='something' AND col2='something different'

    and you have an index on each and every single column then the query optimizer may not consider any of those indexes, since a bookmark lookup would be required to get all values used in the SELECT clause.

    It might be appropriate to use a single index on col1 and col2 with col3 and col4 as included columns.

    SQL server will not combine various indexes for one table within one query.

    Furthermore, it might even drop performance if the table in question is heavily used for updates/inserts/deletes, since each and every index would need to be changed.

    Your statement is false. The optimizer can use more than one index in a select statement on a single table. From this location in Books Online (ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/udb9/html/7c1f2130-5574-4058-bcfb-31c115e9bd00.htm):

    --------------------

    Do not use multiple aliases for a single table in the same query to simulate index intersection. This is no longer necessary because SQL Server automatically considers index intersection and can make use of multiple indexes on the same table in the same query. Consider the sample query:

    SELECT * FROM lineitem

    WHERE partkey BETWEEN 17000 AND 17100 AND

    shipdate BETWEEN '1/1/1994' AND '1/31/1994'

    SQL Server can exploit indexes on both the partkey and shipdate columns, and then perform a hash match between the two subsets to obtain the index intersection.

    --------------------

    I also note that this thread has nothing to do with indexes - it is about statistics.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thanks to all who replied to this post ... your comments are greatly appreciated and helpful.


  • You're welcome.

    Jason...AKA CirqueDeSQLeil

  • There's no real harm in doing what they are asking, so it's hard to come up with a compelling argument not to do it.

    My best shot would probably be to say, look, if we turn auto-create stats on, and async update stats, SQL Server will take care of it all for us. Now let's talk about something interesting, or potentially useful...and so on.

  • Paul White NZ (3/25/2010)


    There's no real harm in doing what they are asking, so it's hard to come up with a compelling argument not to do it.

    My best shot would probably be to say, look, if we turn auto-create stats on, and async update stats, SQL Server will take care of it all for us. Now let's talk about something interesting, or potentially useful...and so on.

    Not sure I agree with that. The update stats stuff (even if you go async) can hit performance pretty hard, and on a busy/critical system you simply cannot afford that sometimes. I am of the "don't do unnecessary crap" mantra personally, and this certainly smells like that...

    Best,
    Kevin G. Boles

  • TheSQLGuru (3/25/2010)


    Not sure I agree with that. The update stats stuff (even if you go async) can hit performance pretty hard, and on a busy/critical system you simply cannot afford that sometimes. I am of the "don't do unnecessary crap" mantra personally, and this certainly smells like that...

    I agree with the sentiment of avoiding unnecessary stuff. But...consider:

    1. It's hard to argue that the idea is entirely daft, given that there is a built-in facility to do it (sp_createstats).

    2. SQL Server will only update statistics if they are potentially useful to the optimizer, and found to be out of date. Statistics that are never useful will never be updated, and therefore add little overhead.

    3. If the statistics would be useful, they will get created at some stage anyway.

    Swings and roundabouts. It depends. Horses for courses. And so on.

    As I said, I think I would just convince the people concerned that 99.9% of systems are happy with auto-create and async-update stats options. Creating all the statistics would be kinda pointless, but maybe not actually dumb.
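
    For reference, the two approaches being compared look roughly like this in T-SQL. Treat it as a sketch only: YourDatabase is a placeholder name, and the options are worth checking against Books Online for your build before running anything.

    ```sql
    -- Option A: let SQL Server manage column statistics itself.
    ALTER DATABASE YourDatabase SET AUTO_CREATE_STATISTICS ON;
    ALTER DATABASE YourDatabase SET AUTO_UPDATE_STATISTICS ON;
    -- Async update only takes effect when AUTO_UPDATE_STATISTICS is also ON.
    ALTER DATABASE YourDatabase SET AUTO_UPDATE_STATISTICS_ASYNC ON;

    -- Option B: what the developers are asking for.
    -- sp_createstats creates single-column statistics on all eligible
    -- columns of all user tables in the current database.
    USE YourDatabase;
    EXEC sp_createstats;
    ```

    With Option A in place, Option B mostly pre-creates statistics that would otherwise be created on demand when a query first needs them.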

    Paul

  • >>2. SQL Server will only update statistics if they are potentially useful to the optimizer, and found to be out of date. Statistics that are never useful will never be updated, and therefore add little overhead.

    Can you please provide a reference for that statement? Thanks!

    Best,
    Kevin G. Boles

