SQL 2005 Database Design

  • Hi all

    Before I start, let me just point out that I am new to proper database design work. At present I am a Analyst Programmer, but primarily focus on UI design and usability for web applications.

    However, recently I have been tasked with building a customer/contacts database which will later incorprate a ticketing type system to monitor customer quieries relating to Clinical Sessions.

    My questions at this point relate to data integrity. I have read numerous articles, bought various books and therefore understand the prinicples around normalisation and general design principles but am not sure of the actual way to implement these ideas.

    For example (this is just an example that I have been working on to make sure I understand what I am doing first), in the code below I have created date, created by user etc... with a row guid. I can understand this here as it keeps important records unique BUT in a table like "ContactAddress" where "Contact" and "Address" are separate tables by themselves are these extra 5 columns essential? Certain articles say yes, others say no. Which one is correct?

    -- =================================================================

    -- Create Schemas

    -- =================================================================

    CREATE SCHEMA [Contract]

    GO

    CREATE SCHEMA [Person]

    GO

    -- =================================================================

    -- Create Table: Person.Contact

    -- =================================================================

    CREATE TABLE [Person].[Contact] (

    [ContactID] [int] IDENTITY(1,1) NOT NULL CONSTRAINT [PK1_ContactID] PRIMARY KEY (ContactID),

    [ClientID] [int] NOT NULL CONSTRAINT [KF1_ClientID] FOREIGN KEY REFERENCES Contract.Client(ClientID),

    [JobTitle] [nvarchar](30) NULL,

    [Department] [nvarchar](30) NULL,

    [Salutation] [nvarchar](25) NULL,

    [ProfessionalTitle] [nvarchar](10) NULL,

    [FirstName] [nvarchar](50) NOT NULL,

    [MiddleName] [nvarchar](50) NULL,

    [LastName] [nvarchar](50) NOT NULL,

    [PublicNote] [varchar](255) NULL,

    [PrivateNote] [varchar](255) NULL,

    [IsActive] [bit] NOT NULL,

    [DateCreated] [datetime] NOT NULL DEFAULT (getdate()),

    [CreatedByUser] [int] NOT NULL,

    [DateModified] [datetime] NULL,

    [ModifiedByUser] [int] NULL,

    [RowGUID] [uniqueidentifier] ROWGUIDCOL NOT NULL DEFAULT (newid())

    )

    GO

    -- =================================================================

    -- Create Table: Person.ContactAddress

    -- =================================================================

    CREATE TABLE [Person].[ContactAddress] (

    [ContactAddressID] [int] IDENTITY(1,1) NOT NULL CONSTRAINT [PK1_ContactAddressID] PRIMARY KEY (ContactAddressID),

    [ContactID] [int] NOT NULL CONSTRAINT [FK1_ContactID] FOREIGN KEY REFERENCES Person.Contact(ContactID),

    [AddressID] [int] NOT NULL CONSTRAINT [FK2_AddressID] FOREIGN KEY REFERENCES Person.Address(AddressID),

    [AddressTypeID] [int] NOT NULL CONSTRAINT [FK3_AddressTypeID] FOREIGN KEY REFERENCES Person.AddressType(AddressTypeID),

    [DateCreated] [datetime] NOT NULL DEFAULT (getdate()),

    [CreatedByUser] [int] NOT NULL,

    [DateModified] [datetime] NULL,

    [ModifiedByUser] [int] NULL,

    [RowGUID] [uniqueidentifier] ROWGUIDCOL NOT NULL DEFAULT (newid())

    )

    GO

  • The first four of those columns you're talking about are there for auditing purposes and shouldn't have anything to do with relational integrity. The last column, the guid, I'm assuming is there for pessimistic concurrency checking. You would validate that the guid is the same as the one you read out prior to running an UPDATE or DELETE and then any statement that modifies the record modifies that value. Again, this is for concurrency, not referential integrity.

    Does that answer the question or just pose more?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Hey Dave,

    Grant has a good point about not guaranteeing referential integrity. The extra columns you have are purely auditing columns. If you removed them, you wouldn't know who changed what, but they shouldn't matter to the base relation/table. If you are concerned about uniqueness, take a look at your table design. If you inserted the same row with the same exact data in it 100 times, does your table prevent this or allow this? If it allows it, but gives it a new ID every time, then you are not guaranteeing uniqueness. If you aren't referring to uniqueness, can you rephrase the question so that we better understand what you are asking?

    Thanks,

    Eric

  • Absolutely true.

    You also might want to look at having simply an Address table with a join table to associate addresses to contacts. That way when you get a married couple with eight kids, you don't get ten seperate records in your address table, all with the same address.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Hi guys

    Thanks very much, by just reading the forums I am learning alot.

    You have all answered my questions exactly.

    last question: Is having these columns on all tables a "best practice" design approach?

    Thanks again for the help, I really appreciate it.

    Dave

  • It depends (I love that answer). To me, this is a business definition. Strictly looking at data, as data, no, none of these columns help to define the Address object. However, if the business wants to know when a record was created or modified or they have a need to know who did what, when, where, then you add these fields. Do you add them everywhere? Again, it depends on the business model. Is it a best practice? Yes and no. If you need auditing information, this is one method. There are other methods all with more or less validity. It all depends on what is needed by your system. I would strongly recommend that if you have no indications one way or the other that this data is needed, find out. If it's not needed, don't add it. Maintaining data that isn't used is a bad practice, not a best practice.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Hi Grant

    Thanks again - in one swoop you answered a big question of mine with DB design.

    If the data is not needed and won't need to be maintained, it is not a best practice approach. I am also going to go back to the customer and try understand his requirements a bit more - Thanks again. ACES!

    Dave

  • Those fields represent your foothold in database management information. Knowing who created a record and who updated last has significant value. Just do it across the board.

    Handle filling in the dates and users in the default value of the table.

    getdate() there is an also an international date stamp

    User

    What I do is fill all four fields in when the record is created using defaults on the table.

    If the two data sets match -- no change has been made since it was created. We don't look at that in code but you could. Having it is better than not having it.

    Use a trigger to handle on "Update".

    Think Y2K. There is plenty of data storage space. Those four fields are the only clue you have to what your users (exactly who) are doing to your data.

    We hire clerks to do data entry. If someone is making an error we know who to talk to and retrain.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply