Database Design question regarding Normalized Tables

  • Hey all,

    Got a theoretical question for you guys. Let's say I have a table that contains information about a store. This information includes things like the telephone number of the store, the address of the store, the name of the store, etc ...

    To me, the design of the table would be such that all of these fields would be contained in a single table, as these are all 1:1 relationships - IE, it wouldn't make sense to me to create one table called StoreTelephone, another table called StoreAddress, etc ...

    Every time I've normalized table structures, I've only set up multiple tables in the case where I have 1:many or many:many relationships.

    However, I'm working with a table right now which is approaching 30 columns - way more than I normally work with. All of these columns, though, are 1:1 relationships with the main identity. So to me, it wouldn't serve any benefit to branch some of these off into other tables.

    Guess what I'm looking for is just some confirmation that what I'm designing isn't completely wrong for some reason or another. Any input is appreciated!

  • kramaswamy (2/2/2013)


    IE, it wouldn't make sense to me to create one table called StoreTelephone, another table called StoreAddress, etc ...

    Stores only ever have a single address and a single telephone number?

    However, I'm working with a table right now which is approaching 30 columns - way more than I normally work with. All of these columns, though, are 1:1 relationships with the main identity. So to me, it wouldn't serve any benefit to branch some of these off into other tables.

    If there's no benefit to moving some columns to another table, why are you considering moving them?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Stores may not have been the best example to use 😛 How bout, let's say, information about a client - the client's home address, name, etc...

    The only reason I'm considering splitting the table into smaller pieces is the question of whether having a table with a large amount of columns would cause any degree of performance degradation.

  • kramaswamy (2/2/2013)


    Stores may not have been the best example to use 😛 How bout, let's say, information about a client - the client's home address, name, etc...

    The only reason I'm considering splitting the table into smaller pieces is the question of whether having a table with a large amount of columns would cause any degree of performance degradation.

    Contact information is even worse than store information. A client can have more than one home address (summer home, winter home, work week home, weekend home, vacation home, multiple shipping addresses, etc), can certainly have more than one phone number, can actually have more than one name in the form of "aliases" (AKA), and much more (how many email addresses can one person have, for example). Most contacts I have have at least 3 contact telephone numbers (home, work, cell) and at least 2 email addresses (semi-private, work).

    I wouldn't count on some of the things you've identified as being only 1:1. I also try to avoid NULLs especially for fixed length datatypes.

    So, like everything else, "It Depends". If you've only ended up with 1:1 relationships in your normalization attempts, one of two things may have happened... either the data you used never did go beyond 1:1 or you didn't normalize as correctly as you may have thought.

    I can give you an example guaranteed to produce a 1:Many relationship... InvoiceHeader and InvoiceDetail. Another would be AccountInfo and AccountTransaction. For contacts, Contact and ContactTelephone.

    The other thing you have to look for is 1:0 information. If a column or set of columns is frequently NULL, that's a pretty good indiction that it should be in a separate table.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Considering something like Contact Info I'd do something like below. I'm not including every possible attribute, just trying to give some ideas. I can't tell you how many wasted and expensive hours I've spent on parsing names and addresses just to mention two categories of contact info. Generally speaking I'm inclined toward more granularity when designing a database structure. Those who just dabble with writing simple queries seem to prefer the spreadsheet model you've described because they don't have to worry about doing complex joins, where clauses, grouping, etc. I've been told before to "just put it all into one table" and of course before the project goes live the same person who insisted no one--no one!--cares about fax numbers anymore makes a last-minute request to add fax numbers.

    I'll say again: this is a very simplified model. What actually needs to go in the various tables is always based on the needs of the application.

    Table:Contact

    ContactID --(could by a natural key like CustomerNumber or SSN)

    ContactType

    Prefix -- the following 5 categories are a minimum when entering names

    FirstName

    MiddleName

    LastName

    Suffix

    MaidenName --or any other UNIQUE traits like:

    Sex

    Birthdate

    --What needs to be in an address table varies country to country.

    --In the USA it's best to get the USPS address guide and enter

    --addresses in the proper format to start with.

    Table:Address -- allows any number of addresses per contact

    AddressID

    AddressType

    AddressDesc --optional descriptive name such as "My Office" or "Spouse's Mom"

    AddressSortOrder

    AddressNumber

    AddressPrefix --S, East, NW, etc

    StreetName

    AddressSuffix

    City

    State

    PostalCode

    Country

    --Just like addresses

    Table:Phone

    PhoneID

    PhoneType

    PhoneDesc

    PhoneIntlCode

    PhonePrefix

    PhoneAreaCode

    PhoneNumber

    PhoneExtension

    --Best to present the end-user a list to choose from or you wind up with a million variations.

    --All of these tables allow for adding a new 'type' at any time or changing the description of a 'type'

    Table:AddressType

    AddressTypeID

    AddressType -- shipping, mailing, home, office, vacation house, etc

    AddressTypeDesc --optional for allowing longer descriptions

    --Could be Customer, Agent, Relative, etc

    Table:ContactType

    ContactTypeID

    ContactType

    ContactTypeDesc

    Table:PhoneType

    PhoneTypeID

    PhoneType --home, cell, office, fax, etc

    PhoneTypeDesc

    --I like to use a cross-reference table because you can sometimes have more than 1 contact at the same address

    --as well as multiple addresses per contact

    Table:ContactAddress

    ContactID

    AddressID

    --For the same reason as addresses

    Table:ContactPhone

    ContactID

    PhoneID

    --It's possible to have other attributes for a contact that are one-to-many;

    --sometimes an EAV model is used in such a case but that opens up Pandora's Box

    Table:ContactChildren

    ContactID

    ChildFirstName (Middle. Last, etc)

    ChildRelationship (etc.)

    Table:ContactAssociations

    ContactID

    AssociationName (etc)

    Table:ContactHistory

    ContactID

    ContactDate

    ContactDescription

    ContactPurpose --which would relate to a ContactPurpose table

    EmployeeID --which might relate to another set of tables for sales people, support, canvassers, etc

    ContactTranscript (etc)

    --These last 3 tables or tables like them could be done using an EAV model;

    --this would allow limitless contact profile categories but can be a beast

    --when trying to write efficient queries. Very flexible though if the category

    --types and hierarchy cannot be easily predicted.

    Table:ProfilePropertyDefinition

    DefinitionID

    DefinitionParentID --so a hierarchy could be built

    DefinitionLevel --again for hierarchical use

    DefinitionType --Children, Associations, History, etc.

    DefinitionDesc (etc)

    Table:ContactProfile

    ContactID

    DefinitionID

    ProfileValue

  • Most of the databases I work with are for vendor supplied systems, but it is not uncommon to see some 1:1 relationships between tables, particularly if a single table might contain a large number of columns.

    Seems to me that I have read an article from Kimberly Tripp at SQLSkills.com that discusses the performance impact of records > 8k that would cause excessive page splits and might be candidates for separation into 1:1 tables. I'm sorry, other than the general idea of the piece, I can't give you more details. Most of the inhouse DB's I've designed have been for relatively small systems and I've never had to break out columns into 1:1 tables.

  • First step is to be certain about whther the relationship is 1:1 or not.

    If you are certain the relationship is 1:1 and you are wondering about the benefits of splitting them into another table, the name for this is 'vertical partitioning'. If you do some research on vertical partitioning you may find some ideas to help you.

    If you are not certain the relationship is 1:1.... then do some anlysis and business process modelling to work this out for certain.

  • Steven Willis (2/3/2013)


    Considering something like Contact Info I'd do something like below...

    Where's the +1 button. Ton of good info and examples Steve.

    --------------------
    Colt 45 - the original point and click interface

  • kramaswamy (2/2/2013)

    The only reason I'm considering splitting the table into smaller pieces is the question of whether having a table with a large amount of columns would cause any degree of performance degradation.

    Depending what you're putting into those columns you might possibly run into issues with the maximum 8000-odd byte length a SQL row can be (ignoring out-of-row data like VARCHAR(MAX) etc). There's also the simple issue of readability--when you come back to this table in six months and have forgotten what columns are in it, how easy is it going to be to scan through those 30+ columns to find the information you're looking for?

    (I know what I speak of, on both counts--I'm often forced to delve around in a database that has a table containing in excess of 200 columns and with a maximum theoretical row length of more than 20k, which causes all sorts of fun!).

  • So in your model, if I got a new cell phone and number , you would need to create a new entry for me. Do you know for a fact that this possibility doesn't exist? I don't see the disadvantage of normalising the database in contrast to the risk of defeating the purpose of a rdms.

    ----------------------------------------------------

  • mmartin1 (2/15/2013)


    So in your model, if I got a new cell phone and number , you would need to create a new entry for me. Do you know for a fact that this possibility doesn't exist? I don't see the disadvantage of normalising the database in contrast to the risk of defeating the purpose of a rdms.

    @mmartin1 which model are you referring to?

    --------------------
    Colt 45 - the original point and click interface

  • I refer to the 1:1 between an entity and a phone number. The OP refers to a store. If I wanted to reach a certain number of the store that goes directly to a manager, dialing direct and waiting on hold a few minutes (example) would be less ideal. This design can only capture one number.

    ----------------------------------------------------

  • Steven Willis (2/3/2013)


    Considering something like Contact Info I'd do something like below. . . . .

    I'll say again: this is a very simplified model. What actually needs to go in the various tables is always based on the needs of the application.

    . . .

    --All of these tables allow for adding a new 'type' at any time or changing the description of a 'type'

    Table:AddressType

    AddressTypeID

    AddressType -- shipping, mailing, home, office, vacation house, etc

    AddressTypeDesc --optional for allowing longer descriptions

    --Could be Customer, Agent, Relative, etc

    Table:ContactType

    ContactTypeID

    ContactType

    ContactTypeDesc

    Table:PhoneType

    PhoneTypeID

    PhoneType --home, cell, office, fax, etc

    PhoneTypeDesc

    . . . .

    --I like to use a cross-reference table because you can sometimes have more than 1 contact at the same address

    --as well as multiple addresses per contact

    Table:ContactAddress

    ContactID

    AddressID

    --For the same reason as addresses

    Table:ContactPhone

    ContactID

    PhoneID

    Lots of great stuff here. I would suggest that what's called a "cross-reference" table is really an "associative" table to relate entities (client to address, client to phone). Another good value of associative tables, besides allowing multiple clients (family members, for instance) to share an address or phone, is to define the nature of the relationship itself. So, for the Client-To-Address association, you'd have a column to distinguish perhaps Home from Work or Mail. This may, in fact, be what was intended by the various "Type" columns; if an address is shared by two clients, it could be home for one but just mail for another (think college student and parent, perhaps), so you'd want the type specified in the association.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply