• I honestly have to say that, in 10 years of being a DBA and writing and tuning queries, I've never bothered with anything as complex as what you seem to be proposing to do.

    I think it would make an interesting academic exercise, but I have never needed it in any practical situation. That doesn't mean someone else hasn't needed it routinely, it just means I haven't. I can't speak for others.

    In query tuning, there are usually (every time I've ever looked) a series of simple actions that will accomplish all the tuning you're likely to need. None of them involve any complex analysis of the guts of the system, like selectivity of complex join statements.

    First, look at the code being used for the query and break down the elements in it. These top things will usually get more speed than you might expect:

    Does it use nested views with unnecessary joins, columns, etc?

    Cursors? Worse yet, nested cursors?

    Recusive UDFs?

    Procedural coding, either in the code or in UDFs that the code calls?

    Triangular joins?

    Execution plan:

    - Bookmark lookups?

    - Unneeded table/index scans?

    - Out-of-date stats?

    - Unexpected tables (usually from nested views or UDFs)?

    Sargability of Joins and Where?

    I once debugged a query that used nested cursors in a set of recursive UDFs, and took execution from 10 minutes to 10 milliseconds just by fixing that to a set-based methodology. That one was keeping anyone from logging into the company's website at all.

    I once debugged a whole server by replacing a procedural UDF with an inline "select max(col) from table where..." subquery, in a query that was key to over half of the web pages that were in use.

    In 10 years of this, I haven't had to worry about selectivity to the degree you're asking about, not once. Which column is the leading edge of an index? Yes, routinely and regularly, but based on how it's used in the query, not on its selectivity.

    So, while you can certainly find answers to the questions you've just posed, do keep in mind that they may be of limited use.

    Also keep in mind that data changes. A solution built with the expectation that a certain column has certain selectivity may be less than optimum within minutes of being implemented.

    I don't want to discourage you (or anyone) from digging into selectivity this way. I just don't want you to go into it thinking it's a key skill.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon