Decoupling in Relational Databases

  • I agree with Gary's comments (and some of the others), as well. If this were the script to one of those 3 to 5 minute little movies that show people how to do something, people would be raving about how rapidly the verbal script got to the point. There are no spelling errors, it starts out by identifying a very common problem and the very well recognized solution, and is generally well written and easy to understand. This is an excellent introduction into "decoupling" and "bridge" tables (whatever you want to call either of those today ;-)) and I'm really surprised that more folks can't see the value of such simple and to-the-point articles especially as primers to newbies.

    I'll also state that one of the purposes for an article is to give people the chance to "cus'n'discuss" about such things so don't ever let that put you off.

    Well done, Tim. Keep up the good work and hope to see another primer such as this from you in the near future.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I had never heard of a 'bridge' table in my 30 years of relational database design and modeling.

    So I was surprised to see that the author, obviously a newcomer to relational databases, had not yet learned about normalization (illustrated painfully in his design with repeating groups), nor about intersection tables to resolve many-to-many table relationships.

    I do agree with one of the responders to this article that there's a "teachable moment" here, when he pointed out that the intersection table (aka 'bridge' table) should have a composite primary key of the two fk_keys, instead of a single surrogate key pk.

    Actually I have to throw a criticism here to our good friends at SqlServerCentral.com for allowing this article to be posted. I'm guessing the 90% of the readers of these articles are seasoned practitioners, but even the other 10% need to know the accepted tenets and nomenclature of relational databases. Quality control in order here?

    Marvin

  • marvin.elder (3/4/2010)


    Actually I have to throw a criticism here to our good friends at SqlServerCentral.com for allowing this article to be posted. I'm guessing the 90% of the readers of these articles are seasoned practitioners, but even the other 10% need to know the accepted tenets and nomenclature of relational databases. Quality control in order here?

    I'd have to disagree with Marvin here. All articles should be read in context of both their rating and linked discussion. You would NEVER get a article like this from a DBA. As a developer, I think Tim gave it the perspective akin to the one I wish some DBAs would give developers regading Data Access Layers. It is because these two pieces are often looked at in isolation that I have seen so many horrendous implementations.

    No to censorship. The freedom of this site gives it its valued diversity.

    FYI I have no connection to SSC (beyond user/membership), Red Gate and I have never played rugby with Steve (Union I am assuming young man!!!).

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • I might agree with Marvin's sentiment IF there were an accepted nomenclature for the concept addressed in the article. Yes, at a high level, it's just normalization. But when discussing the specifics of many to many relationships, there is no generally agreed upon name for them. "Decoupling" is maybe a bit unconventional, but actually makes quite a bit of sense; after all the process of normalization is really that of decoupling-or removing-functional dependencies in the data...

    Besides, as useful as SSC is, and has been over the years, anyone who has frequented it for any amount of time should realize that most of the articles are written by members of the community and as such, should not be considered "authoritative" which is not to say that they can't be useful or even quantifiably correct. Steve has always had more of an open door policy to contributions, which means that just about any lunatic can get published...;-)(speaking primarily of myself who used to be a fairly active author on SSC)

    /*****************

    If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek

    *****************/

  • marvin.elder (3/4/2010)


    I had never heard of a 'bridge' table in my 30 years of relational database design and modeling.

    So I was surprised to see that the author, obviously a newcomer to relational databases, had not yet learned about normalization (illustrated painfully in his design with repeating groups), nor about intersection tables to resolve many-to-many table relationships.

    I do agree with one of the responders to this article that there's a "teachable moment" here, when he pointed out that the intersection table (aka 'bridge' table) should have a composite primary key of the two fk_keys, instead of a single surrogate key pk.

    Actually I have to throw a criticism here to our good friends at SqlServerCentral.com for allowing this article to be posted. I'm guessing the 90% of the readers of these articles are seasoned practitioners, but even the other 10% need to know the accepted tenets and nomenclature of relational databases. Quality control in order here?

    Marvin

    Not everyone who enters into the field of relational databases knows that they are, indeed, relational. Many are thrown to the SQL wolves by necessity or a manager with a bee in his/her bonnet (Hey, I heard of this thing called SQL, we bought a license, now use it!) and told to create a database. Even if they do know it is relational, how to implement the many-to-many relationships isn't always clear.

    SQLServerCentral caters to the newbie and the expert. Gotta have articles that are understandable by both. We've had this discussion before on a few threads as to whether or not Steve needs to change his editorial process and/or acceptance process for articles because some are not as technically correct as they should be, perhaps.

    Steve wants to encourage people to write articles, to use SSC as a place to cut their writing teeth on as it were. Those with any brains whatsoever will check the forums to see what the response to the post is. It is here that the community corrects itself, provides better solutions and hopefully improves overall. And, when there are items in the forum discussion about the article that the reader should know about, then Steve will mark the article with a note that it is best to read the discussion as well.

    I'm an SQL newbie myself. My job includes running ad hoc queries and reports from SQL and monitoring/debugging the processes on one server. I'm not into designing a database yet, although I have one on the side that I'm fiddling around with when I get a bit of spare time. I haven't done any formal training in SQL, just picked it up on the job. Articles like this can help me avoid mistakes like the one he made.

    Thanks for the article, Tim. 🙂

    -- Kit

  • Whoever said that many-to-many relationships don't exist in the real word is wrong. The truth is that the place where many-to-many relationships don't exist is in database designs. If they do you need to go back and add a many-to-many table to remove the many-to-many relationship.

    If you look at your database design and you see something like this (a many-to-many relationship):

    You need to get in a sudden state of alarm and change it to something like this (a many-to-many table):

    It's not necessary to get in a sudden state of alarm but if you do it adds to the drama. 🙂

  • marvin.elder (3/4/2010)


    I had never heard of a 'bridge' table in my 30 years of relational database design and modeling.

    That's pretty amazing in itself. How about a "relation table" or an "intermediate table"? Have you ever heard of an intersection table being called either of those? Or, how about a "junction table"?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jacroberts (3/4/2010)


    It's not necessary to get in a sudden state of alarm but if you do it adds to the drama. 🙂

    I'm so stealing that....

  • timclaason (3/3/2010)


    And although the "SQL Ninjas" out there were quite offended by the simplicity of the article, along with non-standard terms I used (normalization versus decoupling, bridge tables versus link tables), I personally think that this article could be beneficial to people trying to grasp concepts that they may have missed in their "relational databases 101" class in school.

    Heh... not all of the "SQL Ninjas" are offended... at least not the practical ones. BWAA-HAA!!!... think about it. Some of the people making ad hominem attacks have supposedly been in the business for 30 years and haven't heard the term "bridge table". I pretty sure I wouldn't take offense to that. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/4/2010)


    timclaason (3/3/2010)


    And although the "SQL Ninjas" out there were quite offended by the simplicity of the article, along with non-standard terms I used (normalization versus decoupling, bridge tables versus link tables), I personally think that this article could be beneficial to people trying to grasp concepts that they may have missed in their "relational databases 101" class in school.

    Heh... not all of the "SQL Ninjas" are offended... at least not the practical ones. BWAA-HAA!!!... think about it. Some of the people making ad hominem attacks have supposedly been in the business for 30 years and haven't heard the term "bridge table". I pretty sure I wouldn't take offense to that. 😉

    I think I'm more of a SQL Orange Belt. I know a few of the moves, I can even apply some of them but essentially I have a dojo I can go to in order to improve!!!

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Karen Lopez - InfoAdvisors (3/2/2010)


    Oddly enough, I'm getting lots of push back from developers these days that having such tables is "bad design". I'm trying to track down a source for this "knowledge". What devs are telling me is that the real world does not have many-to-many relationships and that these are just artifacts of a designer trying to push relational theory onto a non-relational world. I don't see that

    You should point out to your devs that relational theory is perfectly happy with one-many, many-one, and many-many relationships, so if they know how to do it without a many-many relationship they can perfectly well express it in a relational way without using many-many relationships - challenge them to try to express group membership in English without describing a many to many relationship - when they fail they may realise that this is not an artifact of the sort they imagine.

    What I enjoyed about this article is the "confession" part that showed how a bad design can get one into trouble. Articles should have such anti-patterns to show that there's a reason why something is a better solution.

    Yes, that was the best bit of the article.

    Tom

  • Roger L Reid (3/2/2010)


    A nice improvement on the original, Timothy. So many developers never get past the point you started at.

    May I recommend picking up "The Art of SQL" by Stephane Faroult? And perhaps "SQL and Relational Theory" by C J Date, and maybe Joe Celko's "Thinking In Sets".

    Most developers get into relational databases by accident and never get the background they need. You've obviously got the head for it, and I think you might find these books will consolidate your thinking on these matters.

    Roger Reid

    PS - it is true some of the terminology is off. Given that we are trying to do relational databases using SQL, I think we can cut each other a little slack on vocabulary. The C J Date book will clarify that!

    I usually think that anyone who advocates young and inexperienced database people reading Chris :crazy: :alien: Date should be severely punished for attempting to corrupt the young who are our hope for the future! Better to suggest they read everything they can find by Ted Codd, or Jim Gray, or Carlo Zaniolo, or Phil Bergstein, or indeed almost anyone (:hehe: whether they wrote about relational databases and relational theory or not)!

    Tom

  • Tom,

    So you're "down" on Chris Date? Don't you give him credit for authoring the most widely used textbook on relational database structure and SQL?

    Or is it that you liked his early work, but when he started publishing things like The Third Manifesto with Hugh Darwen he seemed to show that maybe his early success had gone to his head.

    Maybe you can enlighten me as to your thoughts on authors such as Selko, Jim Gray or maybe even Fabian Pascal?

    Marvin

  • marvin.elder (3/8/2010)


    Tom,

    So you're "down" on Chris Date? Don't you give him credit for authoring the most widely used textbook on relational database structure and SQL?

    Or is it that you liked his early work, but when he started publishing things like The Third Manifesto with Hugh Darwen he seemed to show that maybe his early success had gone to his head.

    Well, I was a bit tongue in cheek there and didn't expect to have to justify it, but now I've been called on it I guess I have to explain how I feel about him.

    I don't knowfor sure if the textbook is useful or not (but I believe that it is not a good textbook). When I looked at it (first edition, I think) ages ago I decided it was boring and pretentious and not worth my reading. Anyway my database background was what you might call semi-academic, mostly learnt from journal papers rather than from textbooks, so that may have biased me against the books. I gather from customer reviews at Amazon that the latest edition is incomprehensible to an ordinary mortal, but that's true of a lot of textbooks. Apparently he's now invented his own language to do the examples in and the latest edition has hardly any SQL in it; that's a pretty arrogant approach, I think, but of course both Knuth and Dijkstra invented their own languages for didactic purposes, so there are good precedents (there are also appallingly bad ones, eg Wirth's Pascal).

    I do feel that some of Date's teaching is extremely dangerous, and sets students back quite badly: it can take a long time to get someone to understand why NULL is wanted and needed in the real world after they've had a good dose of Date's anti-null lunacy. He claimed (or is reported as having claimed) to have disagreed with Codd about atomicity. In fact what he really disagreed about was which domains would generally be permissable as attribute domains, also he also announced his own set of principles for being relational which included a very direct statement of (Codd's) atomicity. So that I'm tempted to think that that was all a stunt to try to make himself look like the equal of Codd. Of course this messing about with the definition of atomicity is going to leave the average Joe more confused, not less confused, and it's a plain fact that so far as clarity versus vagueness there's nothing to choose between Codd's version and Date's (both are pretty awful).

    I don't have a down on him like Curt Monash does but like Monash I'm inclined to ask to what extent his textbook - which is clearly what everyone is using, because it has sold in enormous numbers - is responsible for the appalling standard of beginning DBAs and DB Developers.

    I don't like the new stuff much, there's a lot of armwaving, but to some extent there's a good approach to temporal data in it. I don't like his 6NF, but of course it's no more misguided than 4NF or 5NF and a lot less silly than DKNF (but I understand he believes those are good things too). Some of the things he did 30 years ago were quite good, but sometimes I get the feeling that some of his later work is theology instead of science or engineering.

    Maybe you can enlighten me as to your thoughts on authors such as Selko, Jim Gray or maybe even Fabian Pascal?

    I don't know any Selko. If you mean Joe Celko, I like what I've read of his. Don't agree with everything he says, but I would say that his pieces at http://www.dbazine.com/top-auth/top-auth-celko are far more useful to anyone who wants to understand current RDBMS than Date's textbook. Jim Gray did some very good work, wrote some very good papers, which were fun to read. Fabian Pascal wrote two books in the early 90s which I've been told are quite good (I know from experience that people who read them ended up far better at handling RDBMS than people who read Date's bestseller). It has been said that he subsequently grew into an intemperate idealogical jerk (the ideology is probably Date's influence on him) but I think it would be fairer to say he doesn't pull his punches than to call him a jerk (intemperate is correct though). He's good deal more readable than Date, but he seems to have written more anti-SQL (or anti DB2 or anti SQLS or anti Oracle etc) diatribes than scientific papers, and I guess that sort of diatribe is just naturally more readable.

    There's a lot of space between Pascal and Date. They both start from the position that IBM got it all wrong in the 70s, and that there are currently no relational databases in the world, and I guess I have to agree with them. Date seems to claim that he has "improved" on Codd's models to produce the perfect relational model and we should all do it his way (I think his model is poorer than Codd's). Pascal (usually) says that Codd had it about right and we should have implemented Codd's model (I agree); he also says that all our current database systems are a waste of space (there I don't agree with him) and that the way XML data is being introduced means they are getting worse not better (there I certainly agree with him - we have known for decades how to do hierarchic data, things like IDMSX were extremely good DBMSs, but that knowledge is not being used to help devise means of handling XML data in the RDBMS context).

    Tom

  • Tom, I like your style and agree with most of your assessments. NULLs indeed. But I won't even attempt to insinuate that I have anywhere near your understanding of all the works and authors mentioned. It is enough for me to learn and understand the principled ideas. I don't need to know who first found a way to express them best. Good principles existed long before any articulated mouthpiece came along to enunciate the wonders.

    As to the OP article, it is good to see progression. All of us start from birth and learn as we go. I have struggled with exactly how to teach the current run of developmental pontificates the principles of RDB design. I have worked in both arenas. I understood immediately the apples and oranges mix of the vocabulary used in the article. The DEV uses "decouple" because it is a term bandied about much in development, and has taken on a few meanings. I personally think the REST movement is what really muddied the term. It was for a time a catch phrase that had to be used in daily mutterings... but I don't desire to dive into that just now.

    To the point, I don't know of an academic school yet with a course that actually teaches RDBMS design with the effect of producing any significant amount of experienced DBA talent as a result. Academic courses spend more time discussing the history and competing views, rather than on actual principled design implementation. And they certainly don't focus on specific vendor implementation best practices.

    Most schools focus on development, and in that, on design patterns and academic problem solving methodologies, from a developmental standpoint.

    So, in my view, here I see another soul getting a grip on relationships, and being able to regurgitate the concepts (actually terminology be what it may). I smile and remember the days those first waves of understanding washed over me, and I finally could see a normalized structure all laid out in my minds eye.

    Being able to design relational tables, however, is only the start of a being a DBA. I come across many... too many... who think the role of a DBA is unnecessary. "The database is just a bucket to store things in..." and, "we prefer to decouple the application from the DB". Such are the refrains. That ilk all hits the fan the minute management wants any significant reporting and crunching to analyze the data, or the same data has to feed more than one purpose... (and it *always* feeds more than one purpose eventually).

Viewing 15 posts - 76 through 90 (of 98 total)

You must be logged in to reply to this topic. Login to reply