footprinting a database

  • Recently started working at a small company who's previous DB folk seem to have read the don'ts of DB development as the do's. Since we've all dealt with these situations, I was curious how all y'alls tackled this. For me, since PK/FK referential integrity was at best intermittently followed at the DB level, my thought is to find/document every JOIN statement in Views/SP's/Functions/Agent jobs to ID all of key combo's.

    Before anyone starts shooting from the hip, I know this is not perfect/comprehensive or a good idea to just blindly implement - but it should give me a pretty solid idea of data relationships in the db.

    I've read about automated scripts to create key relationships, but AFAIK those rely on field name and data types matching across tables and assume tables with the same key name/datatype would match on data, but in this DB they do not always match.

  • If the DB doesn't have constraints, finding the columns they should be on will likely be the least of the problems. Fixing the data so that constraints can be added will likely be a whole lot harder.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster - Saturday, December 30, 2017 1:08 AM

    If the DB doesn't have constraints, finding the columns they should be on will likely be the least of the problems. Fixing the data so that constraints can be added will likely be a whole lot harder.

    No disagreement there, but we need to start somewhere - open to suggestions. Since the software has been built in house and looks like it is staying that way, we won't be able to just migrate away from the current DB. Current initiatives are moving everything to views/sps, and I'm looking at this idea about the joins.

  • Looking at what has been done in queries, stored procedures etc is a good start, the fact that nothing is documented is someone looking to protect themselves! Another port of call may be indexes as these may give a clue as to what should be happening and also query plans.

    ...

  • ReReplaced - Saturday, December 30, 2017 6:26 AM

    GilaMonster - Saturday, December 30, 2017 1:08 AM

    If the DB doesn't have constraints, finding the columns they should be on will likely be the least of the problems. Fixing the data so that constraints can be added will likely be a whole lot harder.

    No disagreement there, but we need to start somewhere - open to suggestions. Since the software has been built in house and looks like it is staying that way, we won't be able to just migrate away from the current DB. Current initiatives are moving everything to views/sps, and I'm looking at this idea about the joins.

    While you're going through the views, procedures, functions, etc. to find all the joins, do yourself a favor and take note of the data types of the columns involved in each predicate.  If it's as bad as you make it sound, then you'll likely have columns of different data types being joined together.  This implicit cast is devastating to performance, so you might as well make plans to fix the data types while you're identifying the joins. 

    This sounds like it's going to be a pretty serious release script.  I've had to undertake efforts like this and they aren't trivial.  There are very few absolutes in SQL Server, but my experience is that what Gail said is absolutely true.

    HappyGeek - Saturday, December 30, 2017 9:23 AM

    Looking at what has been done in queries, stored procedures etc is a good start, the fact that nothing is documented is someone looking to protect themselves! Another port of call may be indexes as these may give a clue as to what should be happening and also query plans.

    If you do have nonclustered indexes present (use sys.indexes), they may be of some help, but don't assume the person who created them knew what they were doing.  Some may be unused, which you can determine from sys.dm_db_index_usage_stats, but it's only good since the last instance restart.  You can also use sys.indexes to find heaps by using WHERE type = 0.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply