Table Defaults

  • Andy,

    I think this may come down to personal preference and agreed standards. Perhaps a little like the naming convention discussions I'm sure some people will have much stronger views than others, however my 2 penneth worth:

    - I like defaults for things like dates

    - I can see how setting a default for a lookup value may initially look useful but projects that I have worked on previously have regularly seen the business rules develop to an extent where a single default value cannot be specified and it must be calculated based on other criteria. In this instance defaults must be moved back to the business layer/application

    - Defaults, a little bit like triggers, are not overly "visible". By that I mean they are not obvious by looking at a table and it's columns. No biggy but it requires the person using the table to dig a little further. Similarly specifying the defaults in the stored procedures/tsql enforces this visibility but loses the benefits (and potentially breaks) the defaults specified.

    Overall I'm probably caught between options 2 and 3 in your list and I think the realistic answer is that it will vary by organisation and project, however the most important thing, as always, is consistency. Get a consistent approach, certainly within a project, and preferably within an organisation and as long as all people working on that project adhere to it things won't go too far wrong.

  • I have to agree. I'm stuck between option 2 & 3 (with occasional forays into 4). Since defaults aren't readily visible to the developers working on stored procs and since you can easily override defaults, enforcement is a bear. We've got well over 100 developers, the majority creating stored procs on various projects. There are seven of us checking on them. Getting them to use good joins, reference the covering index or clustered index, avoiding LIKE & NOT IN & UNION ALL & DISTINCT... This is where we get the biggest wins over all. It's a shame there wasn't a way (short of masking tables with views or something) to optionally disallow inserts to columns that have a default value specified.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I generally dislike nullable columns, lots of nulls usually mean a poor design, however, sometimes you don't always have all the data - and lets not forget about apps that create new rows by inserting a blank row with the PK ( usually a guid )  and then update the newly created row.

    So if you don't have all the data you need a value , maybe null or maybe a default, such as 'not set' or 'unknown' etc.etc.

    I'm slightly confused about the overwriting of a default, a default is usually used to give a value where one is not provided at the time, for instance a blood group ( well I figure not all people know their blood group ) in a doctor's system.  The defaults such as dates, userid or such - why would you overwrite them - your documentation and data dictionary should cover this and your test and release process catch any issues. ?

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • Sorry Andy - liked the article.

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • Masking with views is a interesting approach to combine with defaults though unwieldy I guess, you need an 'insert' view and an everything else view. Colin, I'm kinda with you, except in practice not everyone has a data dictionary (or reads it). It is perhaps an interesting thing to scan for before applying a change, looking to see if they are overriding a default.

    I appreciate the comments, wasn't sure how this topic would be received!

  • The answer, as always, is "it depends". What are you trying to optimize for?

  • For those interested, here is the link Joe mentioned: http://www.dbazine.com/ofinterest/oi-articles/celko35

    Joe, it's an interesting technique and one I haven't applied at the db level - yet! I'll be looking for a place to try that out. But I'd be curious to hear more from you about pro/con of using defaults.

  • Andy, excellent article. I prefer to use simple text encoding schemes that are human readable for things like state. An example of alternative to your numeric encoding is:

    U = Unverified

    V = Verified

    F = Unverified (Follow up sent)

    X = Expired (Can be deleted)

    The advantage to this is that users who are creating reports on the information will come to understand the encoding scheme easier and won't need to join in another table to get the description. Even when there are just a couple encodings on a table, adding the extra join can be a hassle.

    Joe's article on transition constraints is also very good. Here is the link for everyone: http://www.dbazine.com/ofinterest/oi-articles/celko35/view.

    [font="Tahoma"]Bryant E. Byrd, BSSE MCDBA MCAD[/font]
    Business Intelligence Administrator
    MSBI Administration Blog

  • The problem with using human readable codes such as a single letter is if one of the codes changes and it used in 10,000 million records, you have to choose between having the human readable code no longer match what it means or you have to update 10,000 records.

    For example, using your example above, if management wanted to change Expired to say Deprecated, then either you have to update all of the data with a new code (like D = Deprecated) or you have to have a human readable code that does not immediately indicate its meaning (X = Deprecated).

    I think the integer code is the absolute best way to go and if you want to have a human readable code, make it a computed column or create a view that includes that column. So your lookup table would be something like:

    StatusID, StatusCode, StatusName

    0, U, Unverified

    1, V, Verified

    2, F, Unverified (Follow up sent)

    3, X, Expired (Can be deleted)

    And your view would be something like:

    Select D.*, S.StatusCode

    From DataTable D with(nolock)

    Inner Join Statuses S with(nolock) On D.StatusID = S.StatusID

     


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • I think it was an excellent aricle topic. I think defaults are not only underused but are often misunderstood.

    In my opinion, when creating a table, for every column, the table designer (whether it be a dba or a web developeror whoever) should consider whether Null Values should be allowed or not. If not, then the designer whould conisder if a default value is appropriate.

    Default values are not always appropriate. Consider that many, if not most, of us deal with data from external sources where we have no control over the input. Take Real Estate data for example. I was the dba for a company that did data services including public websites for some really big real estate companies.

    Now, let's consider a common column you would find in any real estate database, Bedrooms. This is a case where you do not want a default value. Why? Because a value means something, and in real estate, saying somethign you don't mean can result in lawsuits. So an agent enters a listing in their local MLS (Multiple Listing Service) and does not enter any value for the bedroom count. Later, we iport the data into our database. If we gave the Bedrooms column a default of 0, then we would likely be misrepresenting that the property has 0 bedrooms when in fact, the agent just forgot to include that key information. Now the home owner is mad because nobody is looking at his house, the agent is mad because he or she never said that the house has 0 bedrooms, etc.

    Maybe that was an overly long explanation. But the point is, you have to understand the data when it comes to default values and use them (or don't use them) wisely.


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • I'll vote with the "it depends" party.

    I've come to think of defaults as a sign of "lazy programmer syndrome" (LPS), for similar reasons as noggin. I prefer for the application (programmers) to explicitly state important information, particularly for important values such as object state -- is the user unverified, verified, etc.). If the app doesn't know the state to assign an item, then there's problems ahead. (Robert Davis spelled this out while I was typing this…)

    Conversely, some values are best assigned by a central authority: the database. Having all "loaded at" date times set by the same server seems wise, given the number of times I've seen out-of-synch servers (three minutes isn't much, at least until it takes you two hours to figure out why something didn't work right...) So defaults on values like CURRENT_TIMESTAMP [dunno why, but I've always used that instead of getdate()], suser_sname(), and the like are fair game. Of course, there's nothing you can do if the app decides to directly write to that column, unless you want to clutter up the system with easily-forgotten triggers (yep, more wasted debugging time behind that opinion).

    One thing you’ve overlooked: nulls work as defaults, and (IMHO) are an indicator of severe LPS. They do have their place, such as open-ended time periods and outer joins, but they should always be avoided unless you know precisely what a null value *means* in that row/column (such as "period ends at unspecified point in the future"). My personal mantra is "NULL is not your friend".

    A last two cents: these days I'm undecided on lookup codes. Numerics (1,2,3,4) are traditional, fast, and expedient, but characters (U,V,F,X) can convey useful information quickly. I see two primary considerations, and I have not idea how they fall out:

    - Character strings must (?) require extra overhead. When SQL compares "A" with "A", how many processor cycles are consumed for code page lookups, case-sensitive, Unicode considerations, or other ASCII ephemera?

    - Which is faster, a lookup table or a CHECK constraint (of the form: [col] in [delimited list])? On small tables (probably never hit three pages of data) I've used up to ten character English status codes with check constraints. (But I wouldn't do this for tables that could have more than 100 rows, for the obvious space and performance reasons.)

    Philip

  • I don't know what your problem is, but you might want to get a copy of How to Get Along with Other Humans.

    Status itself can be an entity requiring an ID, description, and (if deemed appropriate) a human readable code.

    Sorry if my sample code made up on the spot didn't meet the standards of ISO-11179 data element naming rules. Of course, I wasn't trying to demonstrate ISO-11179 data element naming rules.

    If a StatusID and a StatusCode is what is appropriate for my data then that is what will be used regardless of how many ISO standards you have decided to live by.


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • Robert, I salute you for the first sentence of your reply! I think some reply comment was merited without it devolving into flaming or worse and you managed it well.

    Joe, I happen to disagree with your stance on identities. I find them useful and will probably continue to use them. Whether thats because Im more pragmatic, less idealic, or just dumber than you is hard (for me) to say. I think you've done a lot for our profession and could do more still if you'd have that identity conversation calmly and sedately, no matter how much it frustrates you that people like us don't get it. Eventually you'll find a way to make that argument more persuasive, or you'll come over to our side;-). How about we debate it at PASS next year?

    Now let's get back to talking about defaults. Would anyone vote for a setting that would mark a field as being used for a default value only on insert (such as a date added column)? Does anyone have a standard policy in place about defaults?

     

     

  • Thanks for the support Andy.

    I would support the addition of a setting to a column to mark it as not updatable except by the default.

    There are appropriate cases where a column should only be set by the default value. A date added column is a perfect example of that. It may be perfectly appropriate if the column could be marked as not updatable except by the default definition.

    In the past, I have paired date added and date modified columns in tables when I only wanted to track these two values as far as auditing goes. In these cases, both fields would have a default of getdate() that would be set at insert. Then I use a trigger that sets the modified data to getdate() if an update occurs and the column was not already updated. This would allow me to over-ride the trigger by explictly updating the column. For example, if I needed to make a minor modification to records but I did not want the record to appear as having been updated, I would include "DateLastModified = DateLastModified" in the update query to maintain the original modified date.

     


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

Viewing 15 posts - 1 through 15 (of 41 total)

You must be logged in to reply to this topic. Login to reply