To Parent/Child or Not To Parent/Child

  • I have a Bulletins table, where I want to indicate which Customers will see a particular bulletin. I don't expect to ever have more than 100 - 200 customers (each identified by smallint) and probably not even 50 in the foreseeable future.

    So my question is, do I simply add a bigish varchar column that'll hold a string concatenation of Cust ID's, and handle the parsing through code (either VB or T-sql). Or do I go through the trouble of creating a child table to associate CustIDs to Bulletin IDs? I know the parent/child strategy is the *correct* way to go, but it's also more maintanence, coding, etc. for something that I'm almost sure will not be needed (never will expand to the point where the # of customers is unmanageable in a string concat.)

    Thoughts?

    TIA - Kane

  • Doing it the right way (parent-child) has more benefits than just being "right". It will also query more easily, be easier to update, allow you to control for referential integrity, and so on.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • In theory that sounds fine, in practice not so much:

    Say I have Bull1 associated with Cust1, Cust2, Cust3, Cust4, Cust5. User now opens a UI screen with cust_checkboxes checked for these customers. User then unchecks Cust2 and Cust5. I need to reflect this change in the child table.

    So...

    DELETE FROM Bull_to_Cust_tbl WHERE BullID = Bull1

    for each checked value in cust_checkboxes

    INSERT INTO Bull_to_Cust_tbl ...

    end for

    That's alot of inserts versus a single insert to a main bulletin table with Cust# concatenated. Or is this a false issue - lotsa insert no big deal?

  • Plenty of ways to manage that in a child table.

    One option (pretty much the default) would be to delete the ones that were unchecked, and leave the rest of the rows alone.

    Another option, have an "Active" column in the child table, and the checkmarks simply update those directly, instead of deleting.

    Delete and replace (like you mentioned) is valid, but junks up the transaction log. Works, and probably won't create any problems, but it's "less elegant" than some of the other solutions. But that's just my prejudice showing.

    But that's just part of it.

    What happens when you want to delete a customer? In a parent-child multi-join (what you're talking about), you delete the rows for that customer ID from the child table, and you're done. In a "keep them in one column in the main table" solution, you first have to query every row in that table to find if that customer ID is in there, then you have to run an update on each of those rows. Can be done with a Replace command, but you had better be sure that deleting customer ID 3 doesn't accidentally change customer ID 13 to a second copy of customer ID 1 (removing the 3), and so on. One is simple (delete rows from a table based on a foreign key), the other is more complex.

    One version (child table) enforces (with a unique index, quite possibly the clustered index) that any given customer ID can only be associated with any given bulletin ID once. Strings don't have that kind of enforcement, and it would be easy to end up sending multiple copies of the same bulletin to the same person.

    And so on.

    It's not that the list-in-one-field version can't be done, or is some sort of world-destryoing cataclysm, it's that decades of study and testing and live implementations have found that violating First Normal Form almost universally causes more problems than it solves.

    If you want to, try it for a while. See for yourself. The kind of data you're dealing with, it probably won't cause enough problems to be a disaster. It'll end up being really annoying is all. But the experiment might be worth it just for your own edification.

    My first database was a mess. Lots of places where I had solutions very similar to what you're talking about. The way I learned what works and what doesn't was by trying things. I've violated conventional DBA wisdom many, many times. Sometimes it's worked out well, other times it's ended being something I had to pay for, but learned from.

    That's the difference between knowledge and information. Information is something you've studied, knowledge is something you've lived. Knowledge is better.

    So, I say it's not that big a deal in this case, try whichever you like, and if you find that it works, use it. If you find that it's problematic, switch to the other design. It's not that hard to normalize a single field into a sub-table. Requires rewriting some code, etc., but from what you're saying, I don't think that'll be a make/break issue in this case.

    I'd still go with the normalized version, and just play around with the denormalized one in a proof-of-concept environment. That's usually the better way to turn information into knowledge. But it's your data in your database, so it's your decision.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Excellent knowledge-sharing! Thanks for taking the time to help.

  • You're welcome.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Having just spent the last month cleaning up a DB which was very 1st normal running an ASP front end and some interesting T-SQL I would stringly encourage you to go normalised and make it parent child.

    With SQL 2005 it is so easy to do and if you are using SP to access / edit and modify your data anyway it is very easy to maintain.

    What you are suggesting with storing all the Cust ID's in a string will become a PITA very quickly when you start working with it.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply