Understanding a Database

Question

Understanding a Database

Steve Jones - SSC Editor

SSC Guru

Points: 742804
More actions
November 29, 2018 at 9:20 pm

#369205

Comments posted to this topic are about the item Understanding a Database

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply

allinadazework SSCarpal Tunnel Points: 4507 More actions · Answer 1

What you need is a guided tour by someone who knows the database and can explain the general principles and the strengths and weaknesses. if you're lucky there is someone who has been around forever who can spend some time with you explaining the schema.
I tend to set up diagrams for any new databases I create as I think they give the best overview of the structure and you can annotate these easily enough.
A good understanding of the business area helps you get an idea as to how the tables might be structured. Sometimes if you try and suss out the key areas that the database deals with and work out in your head how you would set up the database then often chunks of the real database fall into place.
An understanding of the history of the business, db, applications etc also helps.

Robert Sterbal SSChampion Points: 11079 More actions · Answer 2

I always do table counts. I compare the baseline count with subsequent table counts.

On big tables I'll do distinct counts per column.

External descriptions of what columns are being used for are quite helpful.

412-977-3526 call/text

richardmgreen1 SSChampion Points: 10467 More actions · Answer 3

I fully agree that the vendor databases are usually less than perfect about naming conventions.

As I work in healthcare in the UK (not sure where other posters are from), we have a nationally provided dictionary of terms and descriptions and we map our fields to those.

It's not always perfect as the term for a field might not be perfectly clear in its content but 30 seconds on Google will usually give you the definition.

I also agree that diagrams are helpful, not only for explaining how tables link together, but we've also found that creating a diagram in SSMS helps when creating SSAS cubes.
We've found that, if you select the FACT table, then clicking the "include linked tables" (or whatever the field is called) also brings all the related DIM tables.

stevenb 14609 SSCommitted Points: 1835 More actions · Answer 4

If you have the option of a test environment, or can put in test data, it's easier to figure out what goes where by putting dummy data in the front end and looking to see where your dummy values end up in the database.
I usually start from an entity approach. For example, your customer. Once you figure out what your primary key for your entity is, you can start looking to see where that ID shows up in other tables to figure out what the relationships are.
Lookup values are always a fun one. We have one system that decided cities should be a lookup. It makes sense for the application because it has to store extended information about the city for what it does, but it threw me just a bit when the front end says the customer address is like 123 main street, Springfield MA, and the table data says 123 main street, 54 MA.
I regularly use a script that searches all fields in all tables in the database for specific text values. Sometimes that's the only way to track down some of these lookup values. Especially when you have the fun of a master lookup table. So that's TableID, LookupID, LookupValue.
A lot of it is just knowing how databases are supposed to be designed, and then hoping that the designer of the system followed normal design patterns. Some of it is just luck as to how much hunting you're going to have to do.

roger.plowman SSChampion Points: 10265 More actions · Answer 5

My first step is to use ModelRight's reverse engineering function to suck the database schema into a diagram. Then I spend a little time arranging the diagram to study the relationships between the tables.

Once I understand table categories (these are master tables, these are detail tables, these are code lookup tables, etc.) I start concentrating on understanding which table holds what. Alas I find third party vendors are seldom very good at relational database design.

One product didn't use clustered indexes. At all. Another decided that joins were for the weak. 🙂

And yet others don't understand that tables should have descriptive names. Sigh.

Once I have all that, then I start tracking down what individual columns store. As I do so I add comments and note fields, see if there are any consistently named columns, and so forth. In effect the ERD becomes the data dictionary, with the added advantage that it's also a pretty picture.

My last step is to use SSMS to explore the stored procedures/functions and see if they're named well. Usually--no. (shaking head). Is that approach supposed to be security by obscurity? Or just being bloody minded to outsiders?

richardmgreen1 SSChampion Points: 10467 More actions · Answer 6

<snip>
And yet others don't understand that tables should have descriptive names. Sigh.
<snip>

I feel your pain.
As an example, here are a few of our tables in the vendor database:-
AbstractData
Abstractdata2
AbstractData3

Or, even better:-
SchUkReferral
SchUkReferralMore
SchUkReferralMoreTwo

Apparently, the top table in each of the above lists is from a different section of the application, with the others containing data that should have gone into the original table but they forgot(!).

It get's even better when the tables are not normalised (and even more fun when two data items that should be identical (like a date) ..... aren't!).

Eric M Russell SSC Guru Points: 125627 More actions · Answer 7

Database modeling tools can be expensive, complicated, and despite all their bells and whistles, are ultimately rigid. There have been situations where I have questioned why a database was designed a certain way, things like inappropriate choice of datatypes or use of triggers, and the developer told me that's just how XYZ tool does it. These tools also require a 500mb download to be installed on your PC just to view and edit the project.

However, if what you're really wanting to do is create some basic entity relationship documentation for an existing database and then easily share that documentation with others, then there are some online tools that are either free or affordable. They are easy to use and have all the functionality you'd need for your purpose without all the fuss. Robert Sheldon wrote up a review on several of them a couple of years ago.
https://www.red-gate.com/simple-talk/sql/database-administration/five-online-database-modelling-services/

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Eric M Russell SSC Guru Points: 125627 More actions · Answer 8

Documenting a legacy database is what I call "database cartography". Sometimes it's just built in a piecemeal fashion over time based on exploration and local folklore passed down from gray haired wise-men. There are plenty of areas on the map where I add the label "Here There Be Dragons". :unsure:

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

roger.plowman SSChampion Points: 10265 More actions · Answer 9

Eric M Russell - Friday, November 30, 2018 7:27 AM
Database modeling tools can be expensive, complicated, and despite all their bells and whistles, are ultimately rigid. There have been situations where I have questioned why a database was designed a certain way, things like inappropriate choice of datatypes or use of triggers, and the developer told me that's just how XYZ tool does it. These tools also require a 500mb download to be installed on your PC just to view and edit the project.
However, if what you're really wanting to do is create some basic entity relationship documentation for an existing database and then easily share that documentation with others, then there are some online tools that are either free or affordable. They are easy to use and have all the functionality you'd need for your purpose without all the fuss. Robert Sheldon wrote up a review on several of them a couple of years ago.
https://www.red-gate.com/simple-talk/sql/database-administration/five-online-database-modelling-services/

This is true, of course. It took me forever to find an ERD tool that worked the way I wanted to, and it took a few weeks to really learn it.

HOWEVER.

ModelRight (the one I really like) is a bit pricey. For the SQL-Server only version it's about $600. Having said that ModelRight paid for itself very quickly. I'm a lone-wolf developer/DBA and using this program gave me a force-multiplier equal to hiring two junior DBAs.

It isn't just about drawing pretty pictures. ModelRight is a CADD tool for database design, not only does it allow you to create the diagram proper, but use that diagram to create the scripts that create the database, alter it, and maintain a detailed data dictionary during initial design. It lets you create user-defined data types (translated to native types in scripts) that automatically set constraints and defaults, set table constraints/indexes with naming conventions you can tailor, set up foreign constraints, and allow you to script it all with a single keystroke.

I operate on a severely constrained budget, but tools like this cost a pittance when compared to the salaries of two DBAs--and it's a sunk cost, not a recurring one.

I pick my tools carefully. Between ModelRight and Redgate's SQL Compare/SQL Data Compare it's like having a construction crew with heavy equipment building a canal vs me on my lonesome with a teaspoon...

The fact I can reverse-engineer third-party databases is just icing on the cake. Since, you know, I needed the tool for development anyway.

primitivefuture2006 SSC-Addicted Points: 439 More actions · Answer 10

Eric M Russell - Friday, November 30, 2018 7:27 AM
Database modeling tools can be expensive, complicated, and despite all their bells and whistles, are ultimately rigid. There have been situations where I have questioned why a database was designed a certain way, things like inappropriate choice of datatypes or use of triggers, and the developer told me that's just how XYZ tool does it. These tools also require a 500mb download to be installed on your PC just to view and edit the project.
However, if what you're really wanting to do is create some basic entity relationship documentation for an existing database and then easily share that documentation with others, then there are some online tools that are either free or affordable. They are easy to use and have all the functionality you'd need for your purpose without all the fuss. Robert Sheldon wrote up a review on several of them a couple of years ago.
https://www.red-gate.com/simple-talk/sql/database-administration/five-online-database-modelling-services/

I read that article you linked just now. I was wondering, do any of those listed software create the ER diagrams on their own? Or do they rely on the user to draw and input the column names and keys manually? It's hard to tell from reading the article. Thanks!

Eric M Russell SSC Guru Points: 125627 More actions · Answer 11

primitivefuture2006 - Friday, November 30, 2018 8:02 AM
Eric M Russell - Friday, November 30, 2018 7:27 AM
Database modeling tools can be expensive, complicated, and despite all their bells and whistles, are ultimately rigid. There have been situations where I have questioned why a database was designed a certain way, things like inappropriate choice of datatypes or use of triggers, and the developer told me that's just how XYZ tool does it. These tools also require a 500mb download to be installed on your PC just to view and edit the project.
However, if what you're really wanting to do is create some basic entity relationship documentation for an existing database and then easily share that documentation with others, then there are some online tools that are either free or affordable. They are easy to use and have all the functionality you'd need for your purpose without all the fuss. Robert Sheldon wrote up a review on several of them a couple of years ago.
https://www.red-gate.com/simple-talk/sql/database-administration/five-online-database-modelling-services/
I read that article you linked just now. I was wondering, do any of those listed software create the ER diagrams on their own? Or do they rely on the user to draw and input the column names and keys manually? It's hard to tell from reading the article. Thanks!

One of them, I forget which, let's you upload a sql script containing your DDL (ie: CREATE TABLE) statements, and it will then build a starting model from that.

I see these (online) tools as more for documentation or logical design. I personally don't need a tool to generate DDL for me; I do it fast enough by hand. The thought process that goes into hand coding DDL statements results in a better physical model. Most DBAs and database developers don't often get an opportunity to create a new database, so the focus should be on getting the design right. Cranking out a good enough design as fast as possible, or passing off the task to a team member who isn't really proficient at database design, that's why many legacy databases suck in the first place.

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

mfagan12 SSCommitted Points: 1518 More actions · Answer 12

I've been at my current job for over 12 years (it's the only job I've had where I haven't gotten bored because I get to do both application development and database development) and one of the first programs that a user had an issue with was named, UPR00100. The developer I had been hired to replace, had been here 6 years, and that's what he named the program--which was named after a table it used in a database, which was used by HR for employee changes. Ugh.

Over the years, I have made it a point to name fields that are self-documenting and have emphasized to developers I manage to name the variables in their code to match the database fields as well. That has been working well, for the most part.

Since I work in a small shop, doing both database and front-end development, it's made it easier for me to figure out everything about a database. I've even gone and spent time with the end users to have them explain the processes to me, which has helped in determining what the oddly named fields in some of the databases are there for, etc. Now I know everything off the top of my head. I can only imagine how difficult it would be in an environment where I didn't have that sort of access.

We also have to work with an iSeries database--8 characters in the table and field names. The field names are pretty self-documenting but the table names--not even close.

My boss does want us to get a modeling tool and has mentioned ER/Win and I've stumbled upon ER Studio and so probably within the next 6 months, we'll be utilizing one of those.

primitivefuture2006 SSC-Addicted Points: 439 More actions · Answer 13

roger.plowman - Friday, November 30, 2018 7:03 AM
My first step is to use ModelRight's reverse engineering function to suck the database schema into a diagram. Then I spend a little time arranging the diagram to study the relationships between the tables.
Once I understand table categories (these are master tables, these are detail tables, these are code lookup tables, etc.) I start concentrating on understanding which table holds what. Alas I find third party vendors are seldom very good at relational database design.
One product didn't use clustered indexes. At all. Another decided that joins were for the weak. 🙂
And yet others don't understand that tables should have descriptive names. Sigh.
Once I have all that, then I start tracking down what individual columns store. As I do so I add comments and note fields, see if there are any consistently named columns, and so forth. In effect the ERD becomes the data dictionary, with the added advantage that it's also a pretty picture.
My last step is to use SSMS to explore the stored procedures/functions and see if they're named well. Usually--no. (shaking head). Is that approach supposed to be security by obscurity? Or just being bloody minded to outsiders?

So are you saying ModelRight produces the ER diagrams and documentation by itself, without me having to input any names, keys, or SQL scripts? If that's the case, such a tool is worth the $ in sunken cost.

roger.plowman SSChampion Points: 10265 More actions · Answer 14

primitivefuture2006 - Friday, November 30, 2018 9:06 AM
roger.plowman - Friday, November 30, 2018 7:03 AM
My first step is to use ModelRight's reverse engineering function to suck the database schema into a diagram. Then I spend a little time arranging the diagram to study the relationships between the tables.
Once I understand table categories (these are master tables, these are detail tables, these are code lookup tables, etc.) I start concentrating on understanding which table holds what. Alas I find third party vendors are seldom very good at relational database design.
One product didn't use clustered indexes. At all. Another decided that joins were for the weak. 🙂
And yet others don't understand that tables should have descriptive names. Sigh.
Once I have all that, then I start tracking down what individual columns store. As I do so I add comments and note fields, see if there are any consistently named columns, and so forth. In effect the ERD becomes the data dictionary, with the added advantage that it's also a pretty picture.
My last step is to use SSMS to explore the stored procedures/functions and see if they're named well. Usually--no. (shaking head). Is that approach supposed to be security by obscurity? Or just being bloody minded to outsiders?
So are you saying ModelRight produces the ER diagrams and documentation by itself, without me having to input any names, keys, or SQL scripts? If that's the case, such a tool is worth the $ in sunken cost.

From an existing database? Yes, absolutely. Anything in the database that was created with DDL scripts can be automatically reverse engineered into a new ER diagram.

This includes:
*Table and column names (including any table/column comments that used the MS_Description extended property)
*All table constraints
*All column constraints (including defaults)
*All indexes (including clustered key index, and all non-clustered indexes)
*All table relationships (including cardinality)
*Database roles

The things you have to add after the reverse-engineering creates the ERD include any domains (user defined data types), comments (both notes (ERD only) and comments for anything that didn't have a MS_Description extended property.

Once you have a nicely commented ERD ModelRight can then turn around and create an empty database with the original schema plus notes that can be viewed with SSMS.

In my opinion ModelRight is absolutely worth buying, but be aware if you want to get the full benefit there's a fairly steep learning curve and the documentation on the program is only meh.