Blog Post

Minimalist Design - Part 3

,

One of the labs I frequently do with students is to design a database for a training business, something which I know something about and can add some challenges as the lab progresses. Students tend to either under or over normalize, though of course that's all in your point of view!

As we work through it I remind them that the cost of development goes up with each join. Yes, SQL Server is good at joins and most of write them without problem, and yes, a normalized database is a good thing. But. As a total guess, I'd say that each join adds at least 10 hours work to a project. Remember that in most cases joins require another form/page/control/screen to support it administratively (you do build admin tools don't you?) and those take time - time to design, code, build, test, revise, deploy, etc.

So let's say you're in the common case of needing to get something to market very quickly that is as good as possible, but time to market is a key factor. As you think about joins affecting design and build time, you can take two approaches - omit building the form/page that allows you to use it/maintain it, or deliberately denormalize the design in key places to help you gain time.

The key...is the key places.

Here's an example. As we arrive at a close to final design I typically see that most classes add a locationid and a corresponding location table - this is to indicate where each class will be held. In my particular case 90% of classes are held in Orlando and location isn't huge, but that could change, and I wouldn't argue it's bad design. The other thing I see that is most often missed is they treat the student as a login and capture demographic data for the student, but fail to capture what is for me key data - the employer at the time they attended the class.

Which of those is easier to fix later? Converting a location varchar(200) to a lookup table, or amending the design to capture employer per class attendee? In the latter case we may have lost data we can't recover, in the former we may have junked up data ('Orlando', 'Orlando, FL', etc). I'd argue that if you had to take a shortcut, do it on location. At worst we have to massage the data to convert it later, and even for reporting now it's not likely to cause huge problems.

To recap, there are two points here. One is that you have to understand a design with 100 tables is going to cost a lot more to build than one with 10 tables, up to you to move the slider. The other is that a really good architect/DBA can help decide which tables to remove when someone sets the slider to something besides 100 so that you can evolve the design with the least amount of pain. There's probably something to be said for creating the full 100 table design, then from that reducing it to the x table design for speed. Think of it as designing your dream house, then building it with 3 bedrooms instead of 6, you'll add more when time and funds permit.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating