An Introduction to Database Models

  • Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/fk

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Interesting!

    I thought that hierachal databases were making a come back for very large databases because of their performance advantages.

    Are XML databases hierachal?

    Do you know if there are there any OO databases on the market at present?

     

  • I don't want to make this into any kind of flame war or anything, but I must address a couple of points here. I do not know what Frank Kalis thinks about the models described as the article is mainly a description of them, so these comments are not made to say he is wrong, they are written to address some points in the article that I think can be wrongly understood or accepted by some readers.

    As said in the article, the relational model clearly distinguishes the logical desgin from the physical. The relational model is all about the logical and nothing about the physical part. Therefore it is impossible to say that any other database management model has better or worse performance, because performance is based on the physical design solely. This is in fact the greatest advantage of the relational model, apart from it's theoretical foundation (which no other model has, or at least not as clear), that the decisions you make on logical design issues will not affect the physical design. It is up to the RDBMS to decide how the logical design you create is best handled physically.

    For the same reasons, it is also not correct to say that "[the] most significant [of] several weaknesses of the relational model ... is the inability to handle BLOB's." In the relational model, a relation has one or more attributes that are defined as being of specific domains. There is nothing that says that a domain can not be a BLOB or any other type of data. For a complete discussion and description of this see for instance the paper 'What first normal form really means' by C.J Date. Any flaw in handling these domains is purely a fault of the DBMS, not the relational model. Here is a nice quote from the above mentioned paper:

    A proper object/relational system is just a relational system with proper type support ... --which simply means it's a proper relational system, no more and no less.

    Regarding the OO 'model':

    "The developer of such a system is responsible for implementing methods and properties to handle the data in the database from within his object-oriented application. There is no longer a strict distinction between application and database."

    I would say that there is no longer a database, or at least not a DBMS. The reason to have a DBMS is precisely to have a centralized system for controlling the database, especially the integrity of it, so that the application developers do not need to (or must not even) know and control how to access the data. Using this approach we regress back to 40 years ago and leave all the work to each application, and we can only hope that each application takes care of what needs to be taken care of. Also, this means that each database you create is heavily linked to a specific application, if you want some other application to access the same data in just a slightly different way you probably need to either make changes to the first application or make the integrity constraints even more complex than what is already necessary using this 'model'. You'll notice that I put the word model in quotes when discussing the OO 'model'. This is because I do not think it is a complete data model, unlike the relational model. See for instance 'Something to call one's own'.

    Finally, I have to disagree with the conclusion that "there (still) is a world outside the relational model, which can be pretty exciting to discover". All of these other so-called models are either old ones that were replaced by the relational model because of the flaws they had, or regressions to these old models by re-inventing them and calling them something different. The relational model is based on set theory and predicate order logic, anything trying to replace the relational model must also be a replacement for these theories several centuries old and proven.

    With all this said, I still find this a good and interesting article as it brings up the discussion of points like these, and I kind of think that is one of the reasons why Frank wrote it.

  • As I explained in my other posting (in response to the article itself), any lack of performance is not a fault of the relational model but of the implementation of it (i.e. SQL). However, if we do compare SQL DBMSs to hierarchical DBMSs and do find that the latter generally has higher performance (I have no idea if this is true, and I highly doubt it), remember that this performance would come at a loss of integrity. And without the integrity of the database protected, what good is high performance? Also, once again the hierarchical model had several huge problems incl. the lack of separation between the logical and physical design. This leads to problems like the ones I described above regarding OO databases plus the fact that the user needs to know the physical access paths for the data, and is of course one of the reasons why the relational model replaced it 30 years ago. The same goes for XML which as you probably suspect is hierarchical, yes.

  • I would like to add a little bit of personal opionion on the OODB topic.

    The concept of an Object-Oriented Database just seems wacky  to me. Objects imply action. If you're thinking in terms of Object-Oriented programming, the really important stuff is the behavior of the objects. The data storage is irrelevant. In reality, a relational DBMS is the perfect data store for an OO system. There isn't anything the relational model can not handle for OO data storage. The only thing it can not handle is the behavior/methods/operations/pick-your-term. And by all rights it has no business performing actions on the data since that is clearly business logic.

    I worked with another DBA that did some considerable research on OODB's and his conclusion (Which I expected it to be in the first place) was that they are immature constructs for developers that are too lazy to build a code generation engine to develop the code between the data and the business logic. (Quick plig for my favorite code generator - CodeSmith)

    [font="Tahoma"]Bryant E. Byrd, BSSE MCDBA MCAD[/font]
    Business Intelligence Administrator
    MSBI Administration Blog

  • An article by C.J. Date on data models, what is a data model, and (more importantly?) what's not:

    http://www.dbdebunk.com/page/page/622923.htm

    --
    Adam Machanic
    whoisactive

  • There isn't anything the relational model can not handle for OO data storage. The only thing it can not handle is the behavior/methods/operations/pick-your-term. And by all rights it has no business performing actions on the data since that is clearly business logic.

    I'm very new to database-isms,  but I was making my database a very active one (ie, code the business rules into the database as triggers, stored procedures, and tasks; for example, when an alarm gets inserted into the db, a trigger fires off and if the alarm matches a mask in a different table, the database would email me using sqlmail).

    Your comment makes me wonder if that is really the way I should do it, so here are my newbie questions:

    1. Why separate the business logic into another compartment?  Where/how would it run?

    2. Is there a way to pass an event from the db to an application (db says "New Data in table Foo.", instead of the app pulling a dataset every few minutes and detecting changes -- is there a way to do this w/o pulling a whole dataset)?

    We're planning on having some events that run every minute, others that run daily, and some others (like the alarms above) that run based on row(s) modified in the table.

    Any reason my db should be quiet place of solitude vs a "Active Do-All Buzzing Database"?

    TIA,

    Thor

  • Triggers vs. Polling

    This is a trade-off decision.

    Is the trigger load greater than the load would be if you polled every few minutes? If the database is getting hammered then the trigger load is likely higher than would be expected for polling at intervals.

    How "real-time" do the alerts need to be? If you can wait five minutes or more then polling looks a little more attractive.

    What is the project schedule like?  Seriously though, if you can take time out for quality, you can theoretically eliminate both. I haven't doen this yet but I have been thinking about it. If your application is using .NET or a similarly capable platform (Java should be able to handle it along with many other OO platforms) you could write a data access layer using a single-call type methodology where there is only one set of the classes instantiated at a time and all calls to update the DB are handled through this set of classes. That way you can perform the update and send a notification via event and/or delegate to any other classes that are tied to it. It sounds pretty complex but once you worked out the details I bet you could come up with a code generator that would build the classes for you (See my earlier post).

     

    [font="Tahoma"]Bryant E. Byrd, BSSE MCDBA MCAD[/font]
    Business Intelligence Administrator
    MSBI Administration Blog

  • Thor:

    There is a fine line between "business rules" (e.g. constraints) and "business logic" (e.g. if user enters something in this field, pop up this message)... It really needs to be determined on a case-by-case and application-by-application basis.  Some apps are, by nature, much more data-driven than others.  However, on the whole I've seen much less dependency on the database for constraints than I think there should be.  One of the primary points of the database is the enforce data integrity, yet most shops prefer to do that all in the application layer and use the database only as a data store.  That, IMO, is a mistake.

    As for a way to pass an event from the db to an application, you can look into writing extended stored procedures.  Yet another part of the database that's under-used, IMO!

    --
    Adam Machanic
    whoisactive

  • Tatsu:

    Are you talking about using a queing system to serialize database access?

    --
    Adam Machanic
    whoisactive

  • Business Rules in the DB

    The only rules that should be in the DB are the ones that are used to enforce data integrity. Triggers are often used to enforce data integrity when the database was designed wrong in the first place. If you have a trigger to perform an action when an event occurs, that is business logic. If you are using a trigger to ensure that if a certain flag is set on a row then another table has to have certain rows in it then that is data integrity. A simple criteria for trigger necessity might be this: If I can get a valid report out of the database without the trigger to ensure that the data is good, then it is probably a business-rule trigger and probably should not be in the database.

    Just because the DB server can do a thing, it does not follow that you should make it do that thing.

    Serializing database access

    Yeah, I guess that is what I was thinking. You might be able to scale it out some if the data access components talked to each other too. This would probably be way too much overhead for a high transaction load though and you should probably just go with the polling solution. There are supposed to be a feature coming that will notify data sets when the data has changed I think but I could be way wrong on that.

    [font="Tahoma"]Bryant E. Byrd, BSSE MCDBA MCAD[/font]
    Business Intelligence Administrator
    MSBI Administration Blog

  • Serialization, IMO, defeats the purpose of having a DBMS and brings us back to flat-file land.  Part of the definition of a DBMS is that it's shared - many processes can talk to the DBMS at once.  Serializing in essence is the same as putting the database into single-user mode and then letting the app hit it.  How can that ever perform well?

    --
    Adam Machanic
    whoisactive

  • A simple criteria for trigger necessity might be this: If I can get a valid report out of the database without the trigger to ensure that the data is good, then it is probably a business-rule trigger and probably should not be in the database.

    Just because the DB server can do a thing, it does not follow that you should make it do that thing.

    Hrm.  I am using .NET, and most of the "compute intensive" and otherwise funky stuff (getting data from ancient systems) are running as a windows Service. -- This provides some validation data and alarms for older systems.

    More advanced systems email us their problems; I was planning to use SQLMail on a 5 minute interval to import the alarms into a table.

    With the questions of:

    1. Where will the object run (Service, Webserver/ice)?

    2. How will the object be notified that there are new tasks (polling? MSMQ? funky...) ?

    I thought it would be better to put the "Rule Matching" logic into triggers and stored procedures; these would either call a procedure (mainly for emailing reports), or run a program (using xp_cmdshell).  I will keep these "Event Triggers" separate from the "Integrity Triggers" (it looks like MSSQL can handle multiple trigger procedures... I suppose I'll have to figure out how to order them...)

    On a related note, does anyone have a suggestion for a reporting tool that can output plain text?  I can't seem to get that with either CR or MSReports...

    TIA,

    Thor

  • Thor:

    DataDynamics ActiveReports can.  I've used it quite extensively and IMO it's a pretty solid product.  Better automation capabilities than CR and very cheap too.  I haven't tried MS Reporting Services yet though, so I can't compare it.

    --
    Adam Machanic
    whoisactive

  • I would like to commend Mr. Hedgate on an insightful and articulate critique of this article. The two links in his response provide excellent resource for taking a step towards a more fundamental understanding of the Relation Model.

    I would urge those with an interest in the content to explore the rest of the dbdebunk site. I find Pascal's critiques of XML and Denormalization especially thought-provoking.

     

    TroyK

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply