Voluntary Constraint?

  • Comments posted to this topic are about the item Voluntary Constraint?

    Best wishes,
    Phil Factor

  • Thank you, thank you, thank you, Phil.

    The question isn't so much about whether constraints are a good idea; it's more about understanding the psychology of developers who don't want to know that they're a good idea. It seems to me that, in the sphere of application development, it is too easy for bad database design to be implemented. But why?

    I suspect I'm not the only one who has to deal with software developed by (usually highly intelligent) people who are NOT developers by trade, but who:

    ➡ are very good at what they do in their particular profession of choice

    ➡ enjoy dabbling in software development as a kind of hobby

    ➡ can see where some quick wins could be gained by a "simple" bit of software

    ➡ because they are intelligent and good at what they're trained to do, develop an arrogance that there's nothing they could possibly learn from database designers. "All that guff will just slow us down and cost a fortune" ...which is true in a very narrow sense.

    What happens is that hobby software ends up in production, and is on the face of it, much better than what was there before (paper processes for example). Once it's in production though, removing it eventually becomes akin to cutting out a cancer. Things that worked OK for a few thousand rows eventually start to disintegrate, but only after hundreds or thousands of business dependencies have been built up around them.

    I deal with databases from around 20 different software vendors. The seven most problematic (i.e. over 30%) followed the above path on the way to becoming commercial software. The sheer audacity and arrogance of some hobby developers ("I know the business, therefore I know every aspect of the software requirements") is a large part of their commercial success. As sales people selling to people in their own professions, their cases can be compelling. I honestly don't see a cure.

    I was sitting in a meeting with a vendor once and we were attempting to discuss the manifold data integrity problems we were having because of the non-existence of constraints. The vendor simply closed the discussion down by saying "It's not the software's fault that the data entry staff are idiots." And that was that.

    ...One of the symptoms of an approaching nervous breakdown is the belief that one's work is terribly important... Bertrand Russell

  • Also well said, Old Hand.

    One of the conundrums I often observe in the hobbyist intellectual is that even after the first sign of dark clouds on the horizon they don't seem inclined to join the dots! I think in many cases the ego that created the success (often by force of effort) is not receptive to a retrospective analysis.

    I guess in many cases where success in the venture occurs, a cohort of underlings has to be engaged to scale up the solution - and it is these unfortunates who have to wrestle with the monster that has been created. In many cases this backfill cohort is younger (cheaper), impressionable, disempowered and not necessarily talented - all of which leads to a perpetuation of the original "dodgy" practices under the "ain't broke" mantra.

    In a personal experience, where I sneaked a peek under the hood of a second-time success story (the first having been very successful as well), I was dismayed to see an appalling database implementation - ably masked by a glamorous user interface, but in reality simply a brute-force effort built on foundations of pure sand.

  • I certainly am not arguing with Phil, or the two posters. Maybe I'm just frustrated with some deadlock investigations where neither transaction was doing anything wrong, and if both had asked a few milliseconds apart, both would have succeeded, but instead one of them died in deadlock. Then tracing through the constraints, trying to figure out exactly how it happened and coming to the conclusion there isn't any way to stop it from happening.

    Then sometimes, you can solve the problem by setting up locking hints to cause locking in the right order. Just, why do you have to jump through all these hoops, to solve a problem produced by SQL logic?

  • @Ken Lee

    Put your question to ASK.sqlServerCentral.com, or one of the forums, and see if one of the experts can help with this problem. I agree that, when faced with a frustration like this, one's instinct is to strip off as much complexity as possible to get to the root of the problem. I've never known constraints to be the guilty party in a deadlock, but I'd rather one of the specialists dived in to help you. See Brad's 'How to track down Deadlocks using SQL Server Profiler'. Also check out Jonathan Kehayias and Ted Krueger's book Troubleshooting SQL Server: A Guide for the Accidental DBA.

    Best wishes,
    Phil Factor

  • The article sounded a bit like a rant...

    ... I only wish I'd thought of it first! 🙂

    Well said, Phil.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
        Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Agreed. Only recently have I felt strongly about designing in constraints. I now see an inverse relationship between the number of constraints and the number of post-implementation support issues.

  • Ken Lee-263418 (1/1/2012)


    ... Maybe I'm just frustrated with some deadlock investigations where neither transaction was doing anything wrong, and if both had asked a few milliseconds apart, both would have succeeded, but instead one of them died in deadlock. Then tracing through the constraints, trying to figure out exactly how it happened and coming to the conclusion there isn't any way to stop it from happening.

    Then sometimes, you can solve the problem by setting up locking hints to cause locking in the right order. Just, why do you have to jump through all these hoops, to solve a problem produced by SQL logic?

    I have been here too (with SQL Server as it happens) and the reason is not IMO (1) SQL logic or (2) embedded constraints. In my case it was perverse behaviour of the implementation - in particular the query optimiser recycling query plans in a case where 2 instances of the query with different sargs had run at the same time (previously) and produced inverse (of each other) lock escalation strategies ... then voila a deadlock when reused at a later date!

    What made it even more perverse was "in testing" (ie. without 20K sessions seen in production) the 2nd "bad" query plan never got created (and therefore not left in cache) so showplan just played out sweetness to the unskilled reader. Unfortunately the vendor was not much interested and there was not much room to move on a 2 table join.

  • If a developer argues that these constraints are unnecessary because the application code manages the integrity, it can be argued in return that, after implementing the constraints in the database, they will never be violated anyway. So what's the problem with having them in the database? Furthermore, can it be guaranteed that all access to the database will be through the application? Perhaps. What can be guaranteed is that all access to the database will be through the database.
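    To illustrate the point above - that a declared constraint makes a violation impossible no matter which path the data takes into the database - here's a minimal sketch using SQLite via Python's sqlite3 module (the table and column names are invented for illustration; the same idea applies to a SQL Server CHECK constraint):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Orders (
        OrderID  INTEGER PRIMARY KEY,
        Quantity INTEGER NOT NULL CHECK (Quantity > 0)
    )
""")

# Suppose the application layer "forgets" to validate; the database does not.
try:
    conn.execute("INSERT INTO Orders (OrderID, Quantity) VALUES (1, -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)  # True: the bad row never reaches the table
```

    Whether the insert comes from the application, an ad-hoc script, or a bulk load, the constraint fires all the same - that is the whole argument for declaring it in the database.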

    On the topic of purchasing products with their proprietary databases, perhaps if enough data architects were involved in the purchase of these products and if enough of us said no to those which do not adhere to these constraints, the software vendors would have to build them in or not sell their products. Those are a couple of pretty hefty ifs, but worth pursuing.

  • And then there's me, who's a DB Developer and not a front-end guy, and I build the back end architecture.

    Why don't I usually use constraints except in "absolute, must have" scenarios?

    Because I'm a lazy git and I don't want to do a schema deployment when you change your mind in a week. Besides, this way the constraint illustrates the absolute necessity of that data in that particular system, rather than the schema being littered with constraints every time you need to make a length change.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • I've never yet seen an application correctly validate all the data being fed in through it. Most? Sure. All? No way.

    Just ran into a situation where "the data format is controlled by the front end" was the policy. But someone using a browser with JavaScript turned off managed to bypass all the data validation and put in junk data that violated almost every business rule in the place.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Just noticed your by-line ...for "Row-By-Agonizing-Row"

    Don't you just "curse-OR" rant when that happens? :-)

  • From a SQL developer's perspective it is tempting to assume that data integrity should always be enforced by the database. Unfortunately SQL Server (like most SQL DBMSs) is only really effective for enforcing a few very simple data integrity rules - for example those that can be expressed as FOREIGN KEY constraints or "one row at a time" CHECK constraints. Other types of rules either aren't supported (e.g. mandatory cardinality "at least N" constraints) or require procedural code rather than being part of the data model (inclusion dependency and join dependency constraints).
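    As a concrete illustration of the "simple rules" category above, here's a minimal sketch (SQLite via Python's sqlite3, standing in for any SQL DBMS; the names are invented) showing a FOREIGN KEY and a one-row-at-a-time CHECK doing their job - and, in the comments, why a mandatory-cardinality rule has no declarative home:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE Orders (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID),
        Quantity   INTEGER NOT NULL CHECK (Quantity BETWEEN 1 AND 1000)
    )
""")

conn.execute("INSERT INTO Customers VALUES (1)")
conn.execute("INSERT INTO Orders VALUES (10, 1, 5)")  # passes both constraints

fk_violations = 0
try:
    conn.execute("INSERT INTO Orders VALUES (11, 99, 5)")  # no such customer
except sqlite3.IntegrityError:
    fk_violations += 1

# But a rule such as "every customer must have AT LEAST one order" cannot be
# declared here: a Customers row with no Orders rows is perfectly legal, which
# is exactly the "mandatory cardinality" gap described above.
print(fk_violations)  # 1
```

    The CHECK above only ever sees the single row being written, which is what limits it to "one row at a time" rules.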

    Relying on procedural Transact SQL to enforce complex rules or even simple rules has significant disadvantages. It's a basic tenet of business rules management that rules should be specified declaratively - something that just isn't achievable in SQL. Additionally: T-SQL doesn't integrate easily with other components outside the database; it doesn't support code reuse to the extent of other languages like C# or Java; it is weakly-typed; it can't easily be scaled out across multiple servers; it lacks support by developer tools (e.g. for modelling, static analysis, testing and instrumentation).

    To simplify development, maintenance and support it is therefore quite common for business rules to be centralised in a domain layer or in rules engines and data integration / data quality tools outside the database. That way rules can easily be shared, maintained and inspected by applications, tools and the people who support them. So the DBMS isn't always the preferred place to implement data integrity - but it's certainly a better place than embedding integrity rules separately into every application.

  • sqlvogel (1/4/2012)


    From a SQL developer's perspective it is tempting to assume that data integrity should always be enforced by the database. Unfortunately SQL Server (like most SQL DBMSs) is only really effective for enforcing a few very simple data integrity rules - for example those that can be expressed as FOREIGN KEY constraints or "one row at a time" CHECK constraints. Other types of rules either aren't supported (e.g. mandatory cardinality "at least N" constraints) or require procedural code rather than being part of the data model (inclusion dependency and join dependency constraints).

    Relying on procedural Transact SQL to enforce complex rules or even simple rules has significant disadvantages. It's a basic tenet of business rules management that rules should be specified declaratively - something that just isn't achievable in SQL. Additionally: T-SQL doesn't integrate easily with other components outside the database; it doesn't support code reuse to the extent of other languages like C# or Java; it is weakly-typed; it can't easily be scaled out across multiple servers; it lacks support by developer tools (e.g. for modelling, static analysis, testing and instrumentation).

    To simplify development, maintenance and support it is therefore quite common for business rules to be centralised in a domain layer or in rules engines and data integration / data quality tools outside the database. That way rules can easily be shared, maintained and inspected by applications, tools and the people who support them. So the DBMS isn't always the preferred place to implement data integrity - but it's certainly a better place than embedding integrity rules separately into every application.

    The problem with that is that I've never yet seen it work. It seems to be in the category of "sounds great on paper, but it doesn't actually fly".

    My question on philosophies like that is always, "why bother with a fire extinguisher in the kitchen when you can call 911 and get a truck there to put the fire out?" In other words, have a data rules layer, on top of or integrated with the data access layer, but also enforce the rules at other levels, including the database.

    Yes, there are rules that the database can't readily enforce. Constraints across tables are one of the more obvious ones, where the only answer (triggers) is often worse than the problem it's meant to prevent. And there are rules you want to enforce WAY before the data ever gets sent to a table. You really don't want an SQL error if someone formats a date incorrectly in a form on a website. That should be handled much earlier in the process, even if only to avoid trips to and from the database for performance reasons.

    But there are also rules that really do need to be prevented in the database layer, even if "just in case", like a fire extinguisher. If you don't have a fire extinguisher in your kitchen, you've basically decided that your home burning down because of a grease-spill on the stove is an acceptable outcome. Not having data protection in the database is the same kind of thing.

    An egregious example I ran into a couple of years ago was a company with a firm policy of never deleting or updating any data. They had a bit column in each table named "Active", which went along with a pseudo-ID. For each pseudo-ID, there was only ever supposed to be one row marked active, and all the rest were to have a 0 in that column. For example, customer 123 might have 20 or 30 rows in the Customers table, but only one of those rows would have a 1 in the Active column; the rest would have 0. Deletions would mark all the associated rows with a 0, and updates would insert a new row with the new data, mark it active and mark the old row inactive. Sounds great on paper. They had all the rules for it built into a business rules layer, to make enforcement more efficient.

    After a year of it being in use, they were totally certain it was working perfectly, until I ran a simple Select ... Where Active = 1 Group By pseudo-ID Having Count(*) > 1 and found that some IDs had hundreds of "active" rows, a couple even had thousands, and the percentage of error was simply huge. Because of this, management for that company had been basing key decisions on seriously flawed data, and nobody could figure out why reality never matched the reports very well at all.
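    The audit query described above can be sketched like this (SQLite via Python's sqlite3 as a stand-in; the schema and sample data are invented purely to reproduce the symptom, not taken from the actual system):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Customers (
        RowID    INTEGER PRIMARY KEY,
        PseudoID INTEGER NOT NULL,
        Active   INTEGER NOT NULL   -- rule: at most one Active = 1 per PseudoID
    )
""")
# Nothing in the schema enforces the rule, so bad history accumulates quietly:
conn.executemany(
    "INSERT INTO Customers (PseudoID, Active) VALUES (?, ?)",
    [(123, 1), (123, 1), (123, 0), (456, 1)],
)

# The simple audit from the anecdote above:
violations = conn.execute("""
    SELECT PseudoID, COUNT(*) AS ActiveRows
    FROM Customers
    WHERE Active = 1
    GROUP BY PseudoID
    HAVING COUNT(*) > 1
""").fetchall()

print(violations)  # [(123, 2)] - customer 123 has two "active" rows
```

    One query, run a year too late, is all it took to surface the flaw the business rules layer had been letting through.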

    Had the data been protected at every level, instead of one, it's highly probable that this scenario could have been prevented. It would have been easy to prevent, but they saw that as unnecessary because the business rule layer would take care of it. Kind of like the Titanic not having lifeboats, a double-hull, or sealed compartments, because the barrier walls would "take care of it".

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (1/4/2012)


    For each pseudo-ID, there was only ever supposed to be one row marked active, and all the rest were to have a 0 in that column.

    How would you enforce that in SQL Server? I'm pretty sure it can't be done reliably with CHECK constraints. Indexed views and filtered unique indexes won't stop you updating all rows to be Active=0. Would you use triggers?
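    For what it's worth, a filtered unique index does cover the "at most one active row" half of the rule. Here's a sketch using SQLite's equivalent partial index (via Python's sqlite3; the names are invented). As the question notes, it cannot enforce "at least one" - flipping every row to Active = 0 remains legal, so that half really would need a trigger or an outside check:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Customers (
        RowID    INTEGER PRIMARY KEY,
        PseudoID INTEGER NOT NULL,
        Active   INTEGER NOT NULL
    )
""")
# Partial unique index: at most one Active = 1 row per PseudoID.
# (SQL Server's equivalent is a filtered unique index with WHERE Active = 1.)
conn.execute("""
    CREATE UNIQUE INDEX UX_Customers_OneActive
    ON Customers (PseudoID) WHERE Active = 1
""")

conn.execute("INSERT INTO Customers (PseudoID, Active) VALUES (123, 1)")
conn.execute("INSERT INTO Customers (PseudoID, Active) VALUES (123, 0)")  # fine

try:
    conn.execute("INSERT INTO Customers (PseudoID, Active) VALUES (123, 1)")
    blocked = False
except sqlite3.IntegrityError:
    blocked = True

print(blocked)  # True: a second active row for 123 is rejected
# Caveat: this enforces "at most one", not "exactly one" - an UPDATE setting
# every row to Active = 0 still succeeds, so "at least one" is out of reach
# for the index alone.
```

    In other words, the index would have caught the hundreds-of-active-rows problem from the earlier anecdote, even though it leaves the zero-active-rows case open.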
