Database Design Question

  • I have some developers that want a bunch of fields added to an existing table (it's fairly new) including a GUID field (UUID). My recommendation was that instead of adding a bunch of fields to the current table to create a new table with the fields they needed. And create a PK_FK relationship. They disagreed with this. My point was that I don't like the idea of adding random fields to accommodate a request. It's just messy and causes unneeded bloat. Any thoughts?

  • joshdbguy (12/21/2015)


    I have some developers that want a bunch of fields added to an existing table (it's fairly new) including a GUID field (UUID). My recommendation was that instead of adding a bunch of fields to the current table to create a new table with the fields they needed. And create a PK_FK relationship. They disagreed with this. My point was that I don't like the idea of adding random fields to accommodate a request. It's just messy and causes unneeded bloat. Any thoughts?

    How will creating the new table avoid bloat?

    Would this be a one-to-one relationship? In which case, it would be a PK-PK relationship. If the new 'random fields' are likely to be populated for only a small number of rows in the existing table, that is an argument for a new table.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • What do we mean by adding "random" columns? Do the new columns fall within the scope of the requirements, and is this essentially a 3NF table modeling design?

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Phil Parkin (12/21/2015)


    joshdbguy (12/21/2015)


    I have some developers that want a bunch of fields added to an existing table (it's fairly new) including a GUID field (UUID). My recommendation was that instead of adding a bunch of fields to the current table to create a new table with the fields they needed. And create a PK_FK relationship. They disagreed with this. My point was that I don't like the idea of adding random fields to accommodate a request. It's just messy and causes unneeded bloat. Any thoughts?

    How will creating the new table avoid bloat?

    Would this be a one-to-one relationship? In which case, it would be a PK-PK relationship. If the new 'random fields' are likely to be populated for only a small number of rows in the existing table, that is an argument for a new table.

    The new fields include things like Comments, status (in addition to another status field that already exists), and a UUID field. I would say about 70% of the records will use the new fields.

  • First, verify that they've done a proper design to assign the columns to this table. That is, do not assume they've done that, verify that they've done it. The vast majority of developers are notoriously poor at designing tables.

    If you do add the comments column to the existing table, decide if you want to force it out of row. This will allow you to leave it in the same table, but insure that it doesn't bloat the size of the base table columns so that too few of them fit into a page. This is especially valuable if the comments column will rarely be used.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • If the new columns are required by the business and there is a 1x1 normalized relationship between them and the original table, then add then to the original table. What is the argument against it? If downtime is expected, then just let them know it won't happen today or tomorrow but more like over the weekend.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • From your description of the situation I would add a few columns to a table. As described this would be a 1:1 relationship which is really kind of silly. Since you are worried about "bloat" wouldn't a new table cause even more of that since you will need to store more data now because of the relationship column? For things like comments I would agree it should be another table to accommodate more than 1 comment per parent but for a status? Spinning that another table just seems silly. Of course we don't know much detail at all since we don't know the current structure, what was requested to be added or any other details around this so we are really just guessing.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Let me guess... the table is 100 GB, there are twenty foreign key dependencies, and they want it deployed to production before Christmas? 😛

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (12/21/2015)


    Let me guess... the table is 100 GB, there are twenty foreign key dependencies, and they want it deployed to production before Christmas? 😛

    You left out the 27 views, 11 procedures and 8 ITVFs that need to updated. Since the ITVFs are schema bound to the table, that won't matter, will it? 😉

    On the serious side, it depends on what the table looks like now and what you're adding to it. For example, if you're adding 12 different nvarchar(4000) columns to hold comments, then you may want to consider a side table with a 1:M. It depends on your requirements.

  • Sean Lange (12/21/2015)


    From your description of the situation I would add a few columns to a table. As described this would be a 1:1 relationship which is really kind of silly. Since you are worried about "bloat" wouldn't a new table cause even more of that since you will need to store more data now because of the relationship column? For things like comments I would agree it should be another table to accommodate more than 1 comment per parent but for a status? Spinning that another table just seems silly. Of course we don't know much detail at all since we don't know the current structure, what was requested to be added or any other details around this so we are really just guessing.

    Let me explain better. This table is basically being repurposed for something it wasn't originally intended for. Not only is the naming convention wrong but it's also confusing. On top of that we're getting requests to add more fields to this table and I'm sure this won't be the last request. What I don't want to happen is have a 1 million row table with 500k using one UID field, another 200k using a different UID field, and another 300K using none of the UID fields. On top of that, as we add more to this table I'm sure we'll be getting requests to add a bunch of random fields. This has happened before and we end up with one table that originally having 10 fields, having 50 fields and it being completely confusing as to why one record uses 15 and another one uses 35.

    Wouldn't it be better to have something like a UID table? So the main table joins based on a UID_FK to a UID table. The UID table contains the UID PK (ID), a type field (to denote the type of UID, and maybe a comment. That UID field also joins to a UID code field that contains the type information.

  • joshdbguy (12/21/2015)


    Sean Lange (12/21/2015)


    From your description of the situation I would add a few columns to a table. As described this would be a 1:1 relationship which is really kind of silly. Since you are worried about "bloat" wouldn't a new table cause even more of that since you will need to store more data now because of the relationship column? For things like comments I would agree it should be another table to accommodate more than 1 comment per parent but for a status? Spinning that another table just seems silly. Of course we don't know much detail at all since we don't know the current structure, what was requested to be added or any other details around this so we are really just guessing.

    Let me explain better. This table is basically being repurposed for something it wasn't originally intended for. Not only is the naming convention wrong but it's also confusing. On top of that we're getting requests to add more fields to this table and I'm sure this won't be the last request. What I don't want to happen is have a 1 million row table with 500k using one UID field, another 200k using a different UID field, and another 300K using none of the UID fields. On top of that, as we add more to this table I'm sure we'll be getting requests to add a bunch of random fields. This has happened before and we end up with one table that originally having 10 fields, having 50 fields and it being completely confusing as to why one record uses 15 and another one uses 35.

    Wouldn't it be better to have something like a UID table? So the main table joins based on a UID_FK to a UID table. The UID table contains the UID PK (ID), a type field (to denote the type of UID, and maybe a comment. That UID field also joins to a UID code field that contains the type information.

    Best is to do a normal, proper logical and physical design. Not try to jump ahead to just the physical. As part of that, you'll decide if you need to denormalize or not.

    But there are no real rules about a specific number of columns in a single table. It really depends on the data itself.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Maintain the principle of TNF as far as possible, comments belong in their own table, if an attribute belongs to the entity then add it to the table e.g. GUUI.

    If you have a change request process put such requests through that and make sure the reasons for the change, as well as the changes themselves, are well documented; I will guess these requests are coming from Marketing people who will forget the changes and simply ask for another column to do the same thing later on(Comments1, Comments2, Comment_Summary, Comment_Personal, etc.)!

    ...

  • I generally contain things like CRM Comments, Phone Numbers, and often times Status Codes in a separate table. They are not 1:1 static attributes in the same way that SSN, Birth Date, Hire Date are. For example, a customer can have multiple comments, multiple phone numbers, and status changes may need to be tracked as events over time.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • But regardless of the implementation, you need to let the developers know this deployment isn't going into production until after New Years at the very least. Has this been tested and approved by QA yet? Of course not, and they're out the office this week.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Comments may be added more than once, by different users in different times.

    Adding a new comment does not automatically mean elimination of a previous one.

    So, you're ending up having multiple comments for the same entity record:

    --> separate table, 1 to many relation

    Status is an aggregation of actions performed against an entity.

    Say, an item must be "Approved" to be submitted.

    What happened when you submit an approved item?

    Its status is changed to "Submitted".

    But does it mean it's not approved anymore?

    Well, you have wiped out the "Approved" status, so you cannot tell anymore.

    What if somebody has submitted the item without an appropriate approval?

    How can you tell?

    You cannot - database design is too bad.

    Therefore you should not have a "Status" column.

    ItemStatus should be a separate table with a log of actions performed against the item and the outcome of each of the actions. All the statuses go to the same table. If you need to check if the item was approved - look up for a latest record for the action against the ItemID and see what was the outcome.

    Conclusion: Statuses in a separate table, 1 to many relation.

    Apply similar logic to all other request - it will get you closer to a good DB design.

    _____________
    Code for TallyGenerator

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply