Table Defaults

Question

Table Defaults

Andy Warren

SSC Guru

Points: 119922
More actions
December 27, 2006 at 11:34 am

#114986

Comments posted here are about the content posted at http://www.sqlservercentral.com/columnists/awarren/2779.asp
Andy
Connect with me on LinkedIn

Viewing 15 posts - 1 through 15 (of 42 total)

You must be logged in to reply to this topic. Login to reply

DanKennedy Hall of Fame Points: 3736 More actions · Answer 1

Andy,

I think this may come down to personal preference and agreed standards. Perhaps a little like the naming convention discussions I'm sure some people will have much stronger views than others, however my 2 penneth worth:

- I like defaults for things like dates

- I can see how setting a default for a lookup value may initially look useful but projects that I have worked on previously have regularly seen the business rules develop to an extent where a single default value cannot be specified and it must be calculated based on other criteria. In this instance defaults must be moved back to the business layer/application

- Defaults, a little bit like triggers, are not overly "visible". By that I mean they are not obvious by looking at a table and it's columns. No biggy but it requires the person using the table to dig a little further. Similarly specifying the defaults in the stored procedures/tsql enforces this visibility but loses the benefits (and potentially breaks) the defaults specified.

Overall I'm probably caught between options 2 and 3 in your list and I think the realistic answer is that it will vary by organisation and project, however the most important thing, as always, is consistency. Get a consistent approach, certainly within a project, and preferably within an organisation and as long as all people working on that project adhere to it things won't go too far wrong.

Dan
www.firstcs.co.uk

Grant Fritchey SSC Guru Points: 399418 More actions · Answer 2

I have to agree. I'm stuck between option 2 & 3 (with occasional forays into 4). Since defaults aren't readily visible to the developers working on stored procs and since you can easily override defaults, enforcement is a bear. We've got well over 100 developers, the majority creating stored procs on various projects. There are seven of us checking on them. Getting them to use good joins, reference the covering index or clustered index, avoiding LIKE & NOT IN & UNION ALL & DISTINCT... This is where we get the biggest wins over all. It's a shame there wasn't a way (short of masking tables with views or something) to optionally disallow inserts to columns that have a default value specified.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

colin.Leversuch-Roberts SSC Guru Points: 52643 More actions · Answer 3

I generally dislike nullable columns, lots of nulls usually mean a poor design, however, sometimes you don't always have all the data - and lets not forget about apps that create new rows by inserting a blank row with the PK ( usually a guid ) and then update the newly created row.

So if you don't have all the data you need a value , maybe null or maybe a default, such as 'not set' or 'unknown' etc.etc.

I'm slightly confused about the overwriting of a default, a default is usually used to give a value where one is not provided at the time, for instance a blood group ( well I figure not all people know their blood group ) in a doctor's system. The defaults such as dates, userid or such - why would you overwrite them - your documentation and data dictionary should cover this and your test and release process catch any issues. ?

[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/

colin.Leversuch-Roberts SSC Guru Points: 52643 More actions · Answer 4

Sorry Andy - liked the article.

[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/

Andy Warren SSC Guru Points: 119922 More actions · Answer 5

Masking with views is a interesting approach to combine with defaults though unwieldy I guess, you need an 'insert' view and an everything else view. Colin, I'm kinda with you, except in practice not everyone has a data dictionary (or reads it). It is perhaps an interesting thing to scan for before applying a change, looking to see if they are overriding a default.

I appreciate the comments, wasn't sure how this topic would be received!

Andy
Connect with me on LinkedIn

Stephen Hirsch SSCommitted Points: 1822 More actions · Answer 6

The answer, as always, is "it depends". What are you trying to optimize for?

Andy Warren SSC Guru Points: 119922 More actions · Answer 7

For those interested, here is the link Joe mentioned: http://www.dbazine.com/ofinterest/oi-articles/celko35

Joe, it's an interesting technique and one I haven't applied at the db level - yet! I'll be looking for a place to try that out. But I'd be curious to hear more from you about pro/con of using defaults.

Andy
Connect with me on LinkedIn

Tatsu SSCertifiable Points: 7824 More actions · Answer 8

Andy, excellent article. I prefer to use simple text encoding schemes that are human readable for things like state. An example of alternative to your numeric encoding is:

U = Unverified

V = Verified

F = Unverified (Follow up sent)

X = Expired (Can be deleted)

The advantage to this is that users who are creating reports on the information will come to understand the encoding scheme easier and won't need to join in another table to get the description. Even when there are just a couple encodings on a table, adding the extra join can be a hassle.

Joe's article on transition constraints is also very good. Here is the link for everyone: http://www.dbazine.com/ofinterest/oi-articles/celko35/view.

[font="Tahoma"]Bryant E. Byrd, BSSE MCDBA MCAD[/font]
Business Intelligence Administrator
MSBI Administration Blog

Robert Davis One Orange Chip Points: 28027 More actions · Answer 9

The problem with using human readable codes such as a single letter is if one of the codes changes and it used in 10,000 million records, you have to choose between having the human readable code no longer match what it means or you have to update 10,000 records.

For example, using your example above, if management wanted to change Expired to say Deprecated, then either you have to update all of the data with a new code (like D = Deprecated) or you have to have a human readable code that does not immediately indicate its meaning (X = Deprecated).

I think the integer code is the absolute best way to go and if you want to have a human readable code, make it a computed column or create a view that includes that column. So your lookup table would be something like:

StatusID, StatusCode, StatusName

0, U, Unverified

1, V, Verified

2, F, Unverified (Follow up sent)

3, X, Expired (Can be deleted)

And your view would be something like:

Select D.*, S.StatusCode

From DataTable D with(nolock)

Inner Join Statuses S with(nolock) On D.StatusID = S.StatusID

My blog: SQL Soldier[/url]
SQL Server Best Practices: SQL Server Best Practices
Twitter: @SQLSoldier
My book: Pro SQL Server 2008 Mirroring[/url]
Microsoft Certified Master: SQL Server, Data Platform MVP
Database Engineer at BlueMountain Capital Management[/url]

Robert Davis One Orange Chip Points: 28027 More actions · Answer 10

I think it was an excellent aricle topic. I think defaults are not only underused but are often misunderstood.

In my opinion, when creating a table, for every column, the table designer (whether it be a dba or a web developeror whoever) should consider whether Null Values should be allowed or not. If not, then the designer whould conisder if a default value is appropriate.

Default values are not always appropriate. Consider that many, if not most, of us deal with data from external sources where we have no control over the input. Take Real Estate data for example. I was the dba for a company that did data services including public websites for some really big real estate companies.

Now, let's consider a common column you would find in any real estate database, Bedrooms. This is a case where you do not want a default value. Why? Because a value means something, and in real estate, saying somethign you don't mean can result in lawsuits. So an agent enters a listing in their local MLS (Multiple Listing Service) and does not enter any value for the bedroom count. Later, we iport the data into our database. If we gave the Bedrooms column a default of 0, then we would likely be misrepresenting that the property has 0 bedrooms when in fact, the agent just forgot to include that key information. Now the home owner is mad because nobody is looking at his house, the agent is mad because he or she never said that the house has 0 bedrooms, etc.

Maybe that was an overly long explanation. But the point is, you have to understand the data when it comes to default values and use them (or don't use them) wisely.

My blog: SQL Soldier[/url]
SQL Server Best Practices: SQL Server Best Practices
Twitter: @SQLSoldier
My book: Pro SQL Server 2008 Mirroring[/url]
Microsoft Certified Master: SQL Server, Data Platform MVP
Database Engineer at BlueMountain Capital Management[/url]

Philip Kelley SSCertifiable Points: 6351 More actions · Answer 11

I'll vote with the "it depends" party.

I've come to think of defaults as a sign of "lazy programmer syndrome" (LPS), for similar reasons as noggin. I prefer for the application (programmers) to explicitly state important information, particularly for important values such as object state -- is the user unverified, verified, etc.). If the app doesn't know the state to assign an item, then there's problems ahead. (Robert Davis spelled this out while I was typing this…)

Conversely, some values are best assigned by a central authority: the database. Having all "loaded at" date times set by the same server seems wise, given the number of times I've seen out-of-synch servers (three minutes isn't much, at least until it takes you two hours to figure out why something didn't work right...) So defaults on values like CURRENT_TIMESTAMP [dunno why, but I've always used that instead of getdate()], suser_sname(), and the like are fair game. Of course, there's nothing you can do if the app decides to directly write to that column, unless you want to clutter up the system with easily-forgotten triggers (yep, more wasted debugging time behind that opinion).

One thing you’ve overlooked: nulls work as defaults, and (IMHO) are an indicator of severe LPS. They do have their place, such as open-ended time periods and outer joins, but they should always be avoided unless you know precisely what a null value *means* in that row/column (such as "period ends at unspecified point in the future"). My personal mantra is "NULL is not your friend".

A last two cents: these days I'm undecided on lookup codes. Numerics (1,2,3,4) are traditional, fast, and expedient, but characters (U,V,F,X) can convey useful information quickly. I see two primary considerations, and I have not idea how they fall out:

- Character strings must (?) require extra overhead. When SQL compares "A" with "A", how many processor cycles are consumed for code page lookups, case-sensitive, Unicode considerations, or other ASCII ephemera?

- Which is faster, a lookup table or a CHECK constraint (of the form: [col] in [delimited list])? On small tables (probably never hit three pages of data) I've used up to ten character English status codes with check constraints. (But I wouldn't do this for tables that could have more than 100 rows, for the obvious space and performance reasons.)

Philip

Robert Davis One Orange Chip Points: 28027 More actions · Answer 12

I don't know what your problem is, but you might want to get a copy of How to Get Along with Other Humans.

Status itself can be an entity requiring an ID, description, and (if deemed appropriate) a human readable code.

Sorry if my sample code made up on the spot didn't meet the standards of ISO-11179 data element naming rules. Of course, I wasn't trying to demonstrate ISO-11179 data element naming rules.

If a StatusID and a StatusCode is what is appropriate for my data then that is what will be used regardless of how many ISO standards you have decided to live by.

My blog: SQL Soldier[/url]
SQL Server Best Practices: SQL Server Best Practices
Twitter: @SQLSoldier
My book: Pro SQL Server 2008 Mirroring[/url]
Microsoft Certified Master: SQL Server, Data Platform MVP
Database Engineer at BlueMountain Capital Management[/url]

Andy Warren SSC Guru Points: 119922 More actions · Answer 13

Robert, I salute you for the first sentence of your reply! I think some reply comment was merited without it devolving into flaming or worse and you managed it well.

Joe, I happen to disagree with your stance on identities. I find them useful and will probably continue to use them. Whether thats because Im more pragmatic, less idealic, or just dumber than you is hard (for me) to say. I think you've done a lot for our profession and could do more still if you'd have that identity conversation calmly and sedately, no matter how much it frustrates you that people like us don't get it. Eventually you'll find a way to make that argument more persuasive, or you'll come over to our side;-). How about we debate it at PASS next year?

Now let's get back to talking about defaults. Would anyone vote for a setting that would mark a field as being used for a default value only on insert (such as a date added column)? Does anyone have a standard policy in place about defaults?

Andy
Connect with me on LinkedIn

Robert Davis One Orange Chip Points: 28027 More actions · Answer 14

Thanks for the support Andy.

I would support the addition of a setting to a column to mark it as not updatable except by the default.

There are appropriate cases where a column should only be set by the default value. A date added column is a perfect example of that. It may be perfectly appropriate if the column could be marked as not updatable except by the default definition.

In the past, I have paired date added and date modified columns in tables when I only wanted to track these two values as far as auditing goes. In these cases, both fields would have a default of getdate() that would be set at insert. Then I use a trigger that sets the modified data to getdate() if an update occurs and the column was not already updated. This would allow me to over-ride the trigger by explictly updating the column. For example, if I needed to make a minor modification to records but I did not want the record to appear as having been updated, I would include "DateLastModified = DateLastModified" in the update query to maintain the original modified date.

My blog: SQL Soldier[/url]
SQL Server Best Practices: SQL Server Best Practices
Twitter: @SQLSoldier
My book: Pro SQL Server 2008 Mirroring[/url]
Microsoft Certified Master: SQL Server, Data Platform MVP
Database Engineer at BlueMountain Capital Management[/url]