Table Defaults

,

This article stems from a interesting event on a project I was working on

recently. Overall it's about defaults, but as you'll see it's just as much about

design principles and communication.

I did the database design consisting of about 50 tables, driven by what I'd

consider to be fairly standard design goals:

  • Normalize the design (reasonably!)
  • Avoid nulls as much as possible
  • Date stamp rows where appropriate
  • Provide default values where they make sense
  • etc, etc

I also had a light data dictionary where I documented the purpose of each

table along with the column names and their purposes as well. For those that are

wondering, I built my dictionary by hand in Word. I have no real objection to

using the built in extended properties that can be used for the same purpose,

but I didn't have a tool handy to generate a doc from the properties and

maintaining the doc manually wasn't a lot of work for a project this size. 

I also documented some of the standards to be used and how some of the

business processes would work.

Now going back to the design principles, I hope most of you would agree with

me at least on normalizing and avoiding nulls. Date stamping (requiring you to

track at least the date the record was added to the table) isn't uncommon, but

in my experience tends to be under used in table design. I'm not trying to build

a full auditing solution as I will typically implement that with triggers once

the design has stabilized, I just like to know when rows were added. For the

tables where I felt a date stamp was needed I used a datetime data type

(arguably in some of these cases smalldatetime would have been sufficient, but

we'll save that for another day) and a default of GETUTCDate().

Note: After using getdate() for much of my career, in the past couple

years I've started to use getutcdate() instead. For those of you who may not

have used them, getdate() returns the local server time and getutcdate() returns

universal (GMT) time. If you're running a single server you can use either,

though getdate() is probably more comfortable since the time you see in your

queries will match the time on your watch. Once you start crossing time zones on

multiple servers you'll start to find getutcdate() more interesting.

I also listed 'provide default values where they make sense' as a design

goal. For example, in this project there is a business process where users start

out as unverified, may get moved to a status of unverified (follow up sent), and

hopefully to a status of verified. I prefer numbers over text for this kind of

situation, so I had a look up table where:

  • 0 = Unverified
  • 1 = Verified
  • 2 = Unverified (Follow up sent)
  • 3 = Expired (Can be deleted)

In the users table I set the default to zero. It's not that I couldn't or

wouldn't ever add a user directly with a status of verified, but in the standard

business flow I knew that users would start at zero and by setting it as a

default, I could save passing in one value to a stored procedure and also

indirectly enforce the workflow I had in mind by making sure users started at

the correct status.

I was fortunate enough to be able to outsource some of the early data access

work to a contractor I had worked with many times before and I provided him with

a copy of my docs. Some time later after that initial work was done I was doing

some testing and was not getting the expected results. Digging in to the problem

a bit I found that the test row I had added and expected to see was being

excluded due to a where clause built around the dateadded column. Looking a

little further I finally found the problem in the stored procedure used to add

the row, he had explicitly specified getdate()!

Talk about an "aha" moment. My first thought was that using GetDate() made

sense in his context. Most of us use it routinely and he had worked with me

before when I used it, and of course I had not documented that I was using UTC

for dates throughout. So my fault, and easy enough to fix.

But wait, there's more. Even if he had used GetUTCDate, to me it would have

been bad coding. The column had a default and it would never make sense outside

of some type of administrative action to enter a different value for the date

the record was added, so I would have just omitted the column from the insert

altogether. A middle of the road approach would have been to use the DEFAULT

keyword as the value for the date. The worst part is that if I changed the

default on the table, perhaps to reflect a change in the business approach, the

stored procedure would still be implementing the original behavior.

I think that we wind up with four possible philosophies:

  • Defaults are set for a reason, overriding is treason! Columns with

    default values should be excluded from all insert statements

  • Defaults are ok, you can include columns with defaults in your insert,

    but if you truly want the default value the correct way is to use the

    DEFAULT keyword

  • Defaults are evil. Most likely you would set the column as not null and

    force whoever is writing the insert to do the homework required to figure

    out what a good initial value is or where it should come from.

  • Chaos, where everyone does it however they like on the project!

Aside from avoiding chaos, I'm not sure which to recommend. I've used

defaults for a long time and am used to working that way, but is it helpful or

confusing behavior? Or is it just a matter of setting the standards up front so

there is no collision of philosophies? One downside to not using defaults is you

would no longer be able to do "insert into table default values" which I find

useful when generating a few rows for testing, but have never used in production

code. You could enforce defaults via an instead of trigger, but that seems like

a lot of work that you may regret at some point anyway.

Is it a Zen question, or is there really a right answer? I'm looking forward

to your comments so that we can figure it out together!

Rate

4.4 (5)

Share

Share

Rate

4.4 (5)