Un-queried tables

  • Hi,

    Is there a way to know which tables from a DB are NOT being queried?

    Context : We do a lot of reporting from multiple data sources (SAP, cloud based apps, etc), so from time to time we require the replication of a bunch of tables for our reporting. Our developers prepare the queries and create the reports in SSRS, Cognos, TABLEAU and POWER BI.

    Since we're limited in terms of space in our DB, we now want to list the tables that are absolutely never used/queried in those reports to delete those and free space.

    Is there a way to query this out?

    thanks a lot,

     

  • Nothing that is 100% certain.

    You would be best setting up some sort of extended event trace to capture all incoming queries then shredding the sql_text value to find the objects being touched.

    If all the objects have an index on them of some sort you could look at the index usage DMVs, but again this is not 100% certain as if you reboot the instance the DMV stats reset so you could end up dropping something which is actually in use, or if you only have a report/table that's used in a year end / quarter end / month end etc then it wont show up as being used until its touched after which you've removed the table and you have to go back a year to find it in backups etc.

    You need at least 370 days worth of intelligence to have a feel as to what is used isn't used so this isn't going to be a quick one.

     

  • We have a lot of such tables.  I use the schema of a table to help and got everyone onboard with it.

    If the table is supposed to have a really short lifetime (for example, only long enough to be used for weekly reporting), create a schema called "WK" and have people write their tables to the "WK" schema.  Then write part a stored procedure that will check the creation date of the tables in the "WK" schema and if they were created at least 6 days previous, rename them with a suffix of "_ToBeDeleted" and keep them for another 6.  This will give people a chance to realize their mistake if their code doesn't doesn't automatically create the table(s) their code creates (and IT SHOULD!).

    Do the same for a 27 day period except call the schema "MM'.

    Do the same for an 81 day period except call the schema "QQ"

    Last but not least, do the same for a 350 day period except call the schema "YY"

    Anything that must be permanent should simply not be placed in one of the above schemas.

    As Anthony pointed out in his post above, determining what you can delete from the current tables that don't follow the schema rules above is going to pose a problem.  I strongly suggest that you don't just up and drop the ones that you think are ok to drop just based on created date, last read dates, or hear-say (although the latter of that list is probably the most accurate).  Instead, create a small database to temporarily store such tables and move the tables to that DB so if someone screams bloody murder, you can quickly recover.  You can decide how much time must elapse before you drop them from that temporary database but, again, that can be a problem.  You could backup the database to a nice safe place and then just drop the database.

    The key to the schema method is going to be a written, enforceable policy that the managers agree with and then some form of reporting to identify any new tables that have been created the previous day so that you can question people about tables that are not written to one of the temporally named schemas and, if they turn out to be non-permanent, have them move the table to one of the temporal schemas and change their code to match according to the written policy that that managers have all agreed to.

    Enforcement MUST be rigid and absolute with no "but I'm in a hurry" or "I don't have time" excuses tolerated because everyone should have been informed and forced to read the policy.  Remind everyone that of the necessity along with the standard Borg warning the "Resistance is Futile" in this case. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I don't use separate schemas (nor would I personally recommend it, given the general hassle of it -- different approaches for different folks).  How does one resolve it when the same table name is in multiple short-term schemas?  Is it the same table, perhaps the retention range having change, or a different one(s)?   Or, if I think I need a table for week, then realize it's a month, I have to change the schema and all the associated code referencing the table.  Grr, not too fond of that thought.

    Plus the associated potential permissions chaining complications from different schemas (and the very slight overhead of it).

    Instead, I use the table name itself.  If the table name begins with "tmp_" (that was traditional for this shop back in the day, so it was retained, rather than using the clearer "temp"), when created it is logged into a control table (automatically, via a DDL trigger) and will automatically be "deleted" (moved to a different db) 3 days later unless someone adjusts the control record.

    To get a default 7-day retention, we use "wk_" (same as before, "wk" rather than "work_" for backward compatibility; "wk" does NOT stand for week in our case, fwiw).

    We don't have any automatic retention longer than that (although of course db backups are retained for quite some time), but the retention period in the control table can be changed to up to 366 days.  Only IT personnel can change the retention (some power users are allowed to create tmp_ and wk_ tables, but by themselves they can't extend the table's life).

    If a table lasts more than a year, it must have a permanent name.  Tables with permanent names can, and do, go into the control table as well.  Particularly backup tables created "just in case" to avoid loss of data, such as "dbo.customer_master__BACKUP".

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • If the use of schemas gives one the heebee-jeebees, then a control table like yours or using prefixes on the table names would work just as well.  I like your idea of a control table without imparting any temporal information into the table name because that would allow you to easily change your mind about how long something should be kept in the face of changing requirements without having to change code.

    There are some problems with not using schemas, though... One of the problems I have at work is that they generate "run results" tables that have the ol' _YYYYMMDD extension on the table names because the runs occur every day and they need to keep the daily copies for a week or a month or whatever all the way up to a year (yep... I agree with what your probably thinking about that).  The date suffixes don't bother me because I can use them or ignore them and use the create date in the meta data.  I have them put such things in a different schema so that I'm absolutely sure that they're to be deleted after a week.  Since they have dates in the names, they could all live in the DBO schema but then I'd have to do something a bit more aggravating when doing things like statistics and index maintenance and selecting compression candidates, etc.  If they in that one temporal schema, everything else becomes as easy as including a WHERE OBJECT_SCHEMA_NAME(object_id) NOT IN (list of schemas to ignore).  That also means that I don't have to maintain a control table for any of this nor have anyone forget to include a call to the control table in their code to register a new table on a temporal delete schedule.

    Heh... and, yeah... I also have trap code for the ol' "tmp_TableName" and "tmpTableName" conventions.  The really "fun" part about standards is that everyone has a different one and it's a bit like trying to herd cats to get them all to settle on one... or at least limit it to a dozen. 😀

    And, nope, not disagreeing with you.  I'm just presenting problems I've had to solve and options that I've used to solve them.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Since it's based on name, I just exclude tables that start with 'tmp_', 'temp_', 'work_' or 'wk_'.  Seems just as easy as excluding schemas to me.  And I don't have the issues of separate schemas, and the increased permissions checking when the schema name changes and breaks the permission chain.

    I'm also just presenting potential problems and solutions.  I'm sure there must be other ways to deal with the issue of short-term tables.

    As to, "nor have anyone forget to include a call to the control table in their code to register a new table on a temporal delete schedule".  Again, if it's named correctly, it's automatically registered in the control table (whether the table creator wants it or not).  Of course developers could remove it from the table, but even then there's a "fail safe" process that reports anything with a temp name that's too old, whether it's in the control table or not.  SQL is nice enough to provide an object create date.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Sorry to the OP for this bit of a side issue.

    As noted by others, there's no easy, sure way to know which tables are / aren't being used.  Periodically capturing the results from index_usage_stats is probably the most practical method.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply