Blog Post

T-SQL Tuesday No.72 - Is Your Data Model, Normal?

,

TSQL2Sday72Well I’m not

a data modeler, nor do I play one in the corporate world.  But it is TSQLTuesday No. 72, and I have my

tales to tell.  Mickey Stewe, (b|t),

aka @SQLMickey, database architect, whose focus is on data modeling, has

invited us all to talk on the topic of, well, no surprise, data modeling. Her

invite has been posted on her blog here. Who else but a data modeling

expert would know when things go wrong, and that is what we discuss today.  I have worked hands-on closely with

internationally renowned data architects in various places, and that is not an

understatement.  No names today.  I’ve certainly learned many things, but none

more so than what can go wrong with data modeling.

Data modeling

is a distinct (no pun intended) profession, not a core responsibility of a

typical DBA. I have had my hand at architecting small database structures, and

improving bigger ones, as performance lags. 

On a basic level, everyone should know about database normalization for

relational databases.  We know

normalization is the technique for organizing data in the database in order to minimize

data redundancy. This can optimize data retrieval operations, and ensure referential

integrity. The typical optimal design standards are the 1st, 2nd

and 3rd normal form.  Yes

there are more, but then again we’re talking about things that go wrong.  So here is a great primer of Data Modeling 101,

that everyone attempting to design a database should see. Quick primer, on data

modeling basics of the 3 forms of normalization. 

First Normal Form

  • Eliminate

    repeating groups in individual tables.

  • Create

    a separate table for each set of related data.

Second Normal Form

  • Create

    separate tables for sets of values that apply to multiple records.

  • Relate

    these tables with a foreign key.

Third Normal Form

  • Eliminate

    fields that do not depend on the key.

So, there are

times you want to actually de-normalize your

database, but I’ll leave that for the experts above my DM pay-grade.  However, often times, such as when developers

are designing the database model, don’t follow the simple rules, database chaos

ensues.  It might not be apparent at the

outset, but as time goes by, and your SQL Server starts to creak, and data

grows exponentially, only then do you realize, wow, this thing is not designed

to scale.

One of the

biggest faux pas I’ve seen over the years is when a 3rd party vendor

has a great app, but apparently they didn’t bother hiring a data modeler.  Many of these applications start out on other

platforms, such as Access, or Btrieve (remember that?), and such legacy

applications are migrated to SQL Server for the data backend.  This often results in dozens of tables

without any relations defined and no primary or foreign keys. Half the tables

are 50 columns wide and no indexes.  Most

of these tables have no understandable layout, no proper naming convention – ie

Table001, Table002, etc.  Datatypes are

fudged in migration utilities, and everything comes over as a varchar (max).  Oh, and by the way, you can’t place an index

on a varchar(max) column, so we should ideally limit its use.  This of course wastes space, and results in an

awfully shoddy database design that won’t scale, and of course, performance

will suffer, customers will complain, and businesses will be jeopardized.

Not too long

ago at a client, one to remain nameless to protect me from any lawsuit J , was chugging along on its SQL Server,

until logging in took > 10 minutes thru the front end, and updating or retrieving

records in excess of >10 minutes. 

That’s a life time for an end-client to wait, an absolutely unacceptable

for any business to survive.  I even

cringed at the prospect of coming close to touching it, but was answering the

call of duty and plea from the CEO.  We

made a lot of positive changes but a long way to go.  This database, as it were, could not have

possible been designed by a data modeler or architect.  I was told it was designed by one of the

premier database architects in the country. 

Well, probably one of the premier developers,

but again, not naming names. 

Even so, what

was created at the outset was changed and modified nteen times by revolving

group of developers, and certainly not the same pristine database design that

it started out to be.  So, we cannot

fault the original database architect here, it’s been completely bastardized

since the original design.  Therefore,

the biggest point in this discussion, is that not only must an awesome database

be perfectly designed, the data model must be maintained, and flexible to adapt

to changing requirements.  New tables, column

views, stored procedures, functions, indexes, etc. may need to be added, modified,

or dropped.  Therefore, the database

schema must be versioned, and the original model preserved.

My favorite

analogy to the aforementioned client is The

Nine Circles of Developer Hell, cleverly crafted by Kendra Little

of Brent Ozar Unlimited.  While the

client’s application is still a work in progress, we categorized the database

as being in the 6th circle of

code hell – code heresy.  We were able to bring it back from the 7th

circle - violent code, but there is long way thru the River Stix, to get

back to the 1st Circle, Code Limbo. There is a lot that can go

wrong.  Before I signed up to help this

client, the contract should have read “Abandon

Hope All Ye Who Enter Here

Adam

Machanic’s (b|t), one of the premier SQL

developers, started the T-SQL Tuesday blog party eons ago.  Well, back in 2009.  Here are the rules to T-SQL Tuesdaydom:

  1. Write a

    blog post about the topic. Don’t have a blog? Start one!

  2. Include

    the T-SQL Tuesday Logo and link it back to the original invitation post.

  3. Publish

    your blog post Tuesday, November 10, 2015, between 00:00 GMT & 23:59

    GMT.

  4. Leave a

    reply on the original invite with a link to your blog post (for the

    round-up).

  5. Share you

    post with the community! Tweet it out using the #tsql2sday hashtag

Thanks for

hosting Mickey!



And of

course, if anyone is interested in learning more about my book Healthy SQL – A

Comprehensive Guide to Healthy SQL Server Performance, published by Apress, you can go to the url:

                     http://bit.ly/orderHealthySQLnow

  You can also

get the book on Amazon: http://bit.ly/HealthySQLonAmazon

  For all

things SQL, news, events, jobs, info, and other fun tweets, follow me on

twitter @Pearlknows and join the #HealthySQL campaign to

keep your SQL Servers healthy!

 

 


 

 

 

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating