Discussion: is e-Commerce making a mockery of SQL Server?

  • Some of you may have heard of my latest tales of woe, in that I am in an environment that has Grails on the front end, along with MongoDB, MySQL, Redis, and Red Hat, and all of this is going in favor of Microsoft SQL Server. Assuming the obvious reasons (i.e. SQL licensing costs) are not the motivator, I was dumbfounded to learn that premature optimization rules with an iron fist in their e-Commerce scheme. Foreign keys are so greatly disliked for the sake of speed that a 91-table database might have about 15 tables interconnected - and that is it. Now, that might be all that is needed, but when you factor in that any OLE DB / ODBC / JDBC client can now connect to this database, you have data integrity issues galore.

    I thought perhaps I was in an isolated nuthouse, but a recent conversation that I had with a fairly well-respected DBA out here on the West Coast has shown me that this is becoming a more common practice in e-Commerce scenarios (where speed is the key du jour, and data integrity? Who needs to worry about that...).

    Are we missing something here in the Microsoft SQL Server community that is becoming a new way of doing business, or are people in the Western US who are using SQL Server for e-Commerce just nuts (where being first to market with the latest e-Commerce idea means more to them than the security of the data they display)?

    I would love for all of us to sound off on this one...
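
    As a minimal illustration of the kind of constraint being skipped - a sketch only, using hypothetical Customers/Orders tables rather than anything from the actual schema - a declared foreign key makes the engine itself reject orphan rows, no matter which OLE DB / ODBC / JDBC client submits them:

    CREATE TABLE dbo.Customers
    (
        CustomerID int NOT NULL CONSTRAINT PK_Customers PRIMARY KEY
    );

    CREATE TABLE dbo.Orders
    (
        OrderID    int NOT NULL CONSTRAINT PK_Orders PRIMARY KEY,
        CustomerID int NOT NULL
            CONSTRAINT FK_Orders_Customers REFERENCES dbo.Customers (CustomerID)
    );

    -- Fails with error 547 regardless of which driver issued it:
    INSERT INTO dbo.Orders (OrderID, CustomerID) VALUES (1, 999);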

  • I'm really not sure exactly what you want to discuss.

    There are plenty of people that create denormalized databases and claim they did it for speed, and not just in e-Commerce applications.

    Usually, the answer is simple: they just didn't know how to create a normalized database, so they claim they did it for performance.

  • SQL_ME_RICH (7/30/2012)


    Some of you may have heard of my latest tales of woe, in that I am in an environment that has Grails on the front end, along with MongoDB, MySQL, Redis, and Red Hat, and all of this is going in favor of Microsoft SQL Server. Assuming the obvious reasons (i.e. SQL licensing costs) are not the motivator, I was dumbfounded to learn that premature optimization rules with an iron fist in their e-Commerce scheme. Foreign keys are so greatly disliked for the sake of speed that a 91-table database might have about 15 tables interconnected - and that is it. Now, that might be all that is needed, but when you factor in that any OLE DB / ODBC / JDBC client can now connect to this database, you have data integrity issues galore.

    I thought perhaps I was in an isolated nuthouse, but a recent conversation that I had with a fairly well-respected DBA out here on the West Coast has shown me that this is becoming a more common practice in e-Commerce scenarios (where speed is the key du jour, and data integrity? Who needs to worry about that...).

    Are we missing something here in the Microsoft SQL Server community that is becoming a new way of doing business, or are people in the Western US who are using SQL Server for e-Commerce just nuts (where being first to market with the latest e-Commerce idea means more to them than the security of the data they display)?

    I would love for all of us to sound off on this one...

    The insanity isn't relegated to just the West Coast. It's here in the Midwest, as well. I worked with a supposed "master" developer who knew all of the tools. He used those tools to design a couple of tables. All of the integer columns were BIGINT and all of the character-based columns, even those meant to hold only 1 character, were NVARCHAR(4000). When I asked him why, his answer was "Premature Optimization is the Root of all Evil."

    The statement "Premature Optimization is the Root of all Evil" was never meant to justify such a garbage design. Gail Shaw teaches the subject, and I hope she shows up on this thread because she does a much better job than I do at explaining what it really means.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
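
    For illustration only (a hypothetical lookup table, not the developer's actual design), the contrast looks something like this - the second version simply sizes each column to the data it actually holds:

    -- "Optimized" design: every number a BIGINT, every string an NVARCHAR(4000)
    CREATE TABLE dbo.OrderStatus_Wide
    (
        StatusID   bigint         NOT NULL,
        StatusCode nvarchar(4000) NOT NULL,  -- only ever holds one character
        StatusName nvarchar(4000) NOT NULL
    );

    -- Right-sized design: types chosen from the actual domain of each column
    CREATE TABLE dbo.OrderStatus
    (
        StatusID   tinyint      NOT NULL,    -- a handful of statuses at most
        StatusCode nchar(1)     NOT NULL,
        StatusName nvarchar(50) NOT NULL
    );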

  • Michael Valentine Jones (7/30/2012)


    I'm really not sure exactly what you want to discuss.

    There are plenty of people that create denormalized databases and claim they did it for speed, and not just in e-Commerce applications.

    Usually, the answer is simple: they just didn't know how to create a normalized database, so they claim they did it for performance.

    This is far too common. And every time I have seen it, the argument was the same: it was done for performance. Refactoring the database to demonstrate the error of their ways showed no problems with performance - just that they did not understand good database design.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • SQLRNNR (7/30/2012)


    Michael Valentine Jones (7/30/2012)


    I'm really not sure exactly what you want to discuss.

    There are plenty of people that create denormalized databases and claim they did it for speed, and not just in e-Commerce applications.

    Usually, the answer is simple: they just didn't know how to create a normalized database, so they claim they did it for performance.

    This is far too common. And every time I have seen it, the argument was the same: it was done for performance. Refactoring the database to demonstrate the error of their ways showed no problems with performance - just that they did not understand good database design.

    The claim of having denormalized for performance is usually easy to shoot down. Just ask to see the normalized model that they denormalized from.

  • I worked on a project for about three years (it was supposed to take 18 months) that still wasn't done when I left the company. They too had jumped on the premature optimization bandwagon, taking it to mean, as many do, that you don't have to design anything, ever. They primarily used nHibernate as the means of ignoring the database (instead of ORM, they made it into OOM, Object-to-Object Mapping). It completely eliminated the database and DBAs from their development process...

    Now, 4.5 years into the project (again, it was only going to be 18 months long, eliminating the DBAs is part of what makes it so much faster to deliver...) they still haven't delivered and they're finding HUGE performance issues. Plus, here's a shocker, people actually expect to be able to run reports on the data, but since it's stored as objects, you can't easily cut across it. They are throwing money at the problem currently. They've implemented Availability Groups to create a read-only mirror of the database so that they can remove reporting from the production system, but that didn't fix the performance issues. Now, they're working on a full-blown database design and will build a load process to go from the object model to a relational model, but only for reporting. It's a nightmare. What's more, it's a nightmare that I predicted, in a detailed report, written 4.5 years ago when the project started.

    You're not alone.

    What I'd say is happening is that developers seem to be more and more frustrated with the persistence of data. Combine this with a lack of understanding of data systems and you get ORM tools, NoSQL movements, premature optimization as a design paradigm, etc. It's all an attempt to bypass the thing that is hard for them. Unfortunately for all these guys, the core mechanisms of relational storage, whether in OLTP normalized systems or star schemas or whatever, still work. But they only work when you do it correctly.

    By the way, I'm not against ORM tools. I think they're great. But you still need a database design. I'm not against MongoDB or any of the other NoSQL databases, but when you have to do reporting against them, you're going to relational storage again. The simple fact is, the relational model is a good one and nothing has yet come along that will completely replace it. Not yet.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I really do appreciate you all sounding off here. It really rattled me earlier this year when I was told that the place I worked at didn't like having referential integrity down at the database level. When I heard someone else respond with "yeah - that's good" (in reference to there being no RI in the database), I started to think I was being left in the dark about why SQL Server was being used this way.

    The all too common response is, as you have all pointed out: we need speed, and the engine doesn't have to worry about checking RI when we do it up in the app.

    My response to this is often with a smile and a nod, but inside I am just fuming with frustration over a mentality that is prevailing in today's e-Commerce (and other) markets.

    SQL Server (from what I have studied) was not meant to be compromised in this way, but if people believe that they will lose overall query speed for their app by putting an appropriate design in place for RI - what can you do?

    :crazy:

  • I too have seen all kinds of justifications for various data-nightmare scenarios.

    One place I worked, they were proud of the fact that there were no delete statements ever issued to any of their databases, and the only updates were to mark a "Current" column 0 instead of 1. Everything else was inserts.

    They were completely positive they had this virtual delete/update environment fully in hand and perfected. When I started there, I found that they had thousands of "ID" values with as many as thousands of rows each marked "Current", in violation of their model (there was only ever supposed to be one Current row per "ID"), and in many cases there was no way to tell which one was correct.

    Every report they had given to management about the business, for over 2 years, had been egregiously wrong, in terms of minor things like sales volume, customer retention, inventory, and everything else.

    All referential integrity was handled in the front end, because the fake ID values they used prevented having FKs in the database. They had tens of thousands of rows of orphan data, which were again falsifying reports, graphs, etc.

    When I found this, the solution the manager responsible for those systems came up with was to fire me and hope nobody ever found out.

    Fun times.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Ow, G2, that SUCKS. Been there, done that, though. However, my version of that report got every consultant let go, because they couldn't 'let go' normal employees while temps worked in their department.

    I'm not the biggest fan of Foreign Keys, personally. They can be a royal PITA when trying to do bulk work against a database and can slow things down just an extra hair. That said, there's a huge difference between 'Do we need a Foreign Key here?' and 'NO FOREIGN KEYS EVAH!'

    Gyeah, I'd like to take ORM out in the back yard and Ol' Yeller it some days when I hear these horror stories. Got a few of my own at the moment but they're from a different app vs. DB mindset.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA
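
    On the bulk-work pain specifically, one middle ground - sketched here with a hypothetical constraint name - is to disable the key for the load and re-enable it WITH CHECK afterwards, so SQL Server revalidates the loaded rows and the constraint stays trusted:

    -- Disable the FK check for the duration of the bulk load
    ALTER TABLE dbo.Orders NOCHECK CONSTRAINT FK_Orders_Customers;

    -- ... bulk insert / ETL work goes here ...

    -- Re-enable it; WITH CHECK revalidates existing rows so the
    -- constraint is marked trusted and usable by the optimizer again.
    ALTER TABLE dbo.Orders WITH CHECK CHECK CONSTRAINT FK_Orders_Customers;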

  • Premature optimisation is the process of making design decisions or modifications based on assumed performance improvement without testing, benchmarking and verifying the change.

    "Foreign keys are slow so I won't create them" is in fact an example of premature optimisation. (unless someone has sat down, done the measurements and found that in this particular case the foreign keys are unacceptably slow)

    "I'm denormalising for performance" is another common case of premature optimisation, someone has assumes that normalised designs are slow and denormalised without testing that assumption (at least I've never encountered a case where they did test).

    Optimising queries or tweaking data models or playing with hints or indexes without any measurements and without any idea of the results of the fiddling is premature optimisation.

    Creating a good, solid, scalable design is not premature optimisation. It's doing a good #@$@# job.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
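
    A minimal sketch of what "sat down, done the measurements" can look like in practice (hypothetical Orders/OrdersStaging tables, not from any poster's system): run the same load with the key in place and with it disabled, then compare the numbers before declaring the key too slow.

    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    -- Load once with the foreign key checked...
    INSERT INTO dbo.Orders (OrderID, CustomerID)
    SELECT OrderID, CustomerID
    FROM   dbo.OrdersStaging;

    -- ...then repeat after
    --     ALTER TABLE dbo.Orders NOCHECK CONSTRAINT FK_Orders_Customers;
    -- and compare CPU time, elapsed time and reads between the two runs.

    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;
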
  • I have seen this happen in some companies where they used code-first in EF, Hibernate, etc. On the other hand, I worked with a company that handled the situation beautifully using in-memory databases, which then generated logs that were imported into a normal SQL database for later reporting. I think the database they used was VoltDB or kdb+; I can't remember now.

    Jayanth Kurup

  • I don't usually see foreign keys as a performance question, really. How fast your database and code are doesn't much matter if the data they retrieve is wrong. Wrong data served up really, really fast is much worse than correct data that's marginally slower. I'd rather make managers wait a couple of extra milliseconds than have them assign incorrect sales bonuses/commissions because the data is flawed.

    I don't really care if "optimization" is premature or not, anywhere near so much as I care if the data is as correct as possible, within valid precision/accuracy models.

    So FKs for data protection matter a lot to me. Possibly because I've never yet seen "we can do that in the front-end/DAL/business-rules layer" do anything other than fail horribly. Consistency is only a virtue when it's consistently right, after all.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Grant Fritchey (7/31/2012)


    I worked on a project for about three years (it was supposed to take 18 months) that still wasn't done when I left the company. They too had jumped on the premature optimization bandwagon, taking it to mean, as many do, that you don't have to design anything, ever. They primarily used nHibernate as the means of ignoring the database (instead of ORM, they made it into OOM, Object-to-Object Mapping). It completely eliminated the database and DBAs from their development process...

    Now, 4.5 years into the project (again, it was only going to be 18 months long, eliminating the DBAs is part of what makes it so much faster to deliver...) they still haven't delivered and they're finding HUGE performance issues. Plus, here's a shocker, people actually expect to be able to run reports on the data, but since it's stored as objects, you can't easily cut across it. They are throwing money at the problem currently. They've implemented Availability Groups to create a read-only mirror of the database so that they can remove reporting from the production system, but that didn't fix the performance issues. Now, they're working on a full-blown database design and will build a load process to go from the object model to a relational model, but only for reporting. It's a nightmare. What's more, it's a nightmare that I predicted, in a detailed report, written 4.5 years ago when the project started.

    Wow, I can't believe they are still pushing forward in the same manner. I remember these stories about the project from some time ago and am dumbfounded that they have not changed (well, except that you left them :-D).

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • GilaMonster (7/31/2012)


    Creating a good, solid, scalable design is not premature optimisation. It's doing a good #@$@# job.

    +Infinity

    Especially for the curse words.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • SQLRNNR (7/31/2012)


    Grant Fritchey (7/31/2012)


    I worked on a project for about three years (it was supposed to take 18 months) that still wasn't done when I left the company. They too had jumped on the premature optimization bandwagon, taking it to mean, as many do, that you don't have to design anything, ever. They primarily used nHibernate as the means of ignoring the database (instead of ORM, they made it into OOM, Object-to-Object Mapping). It completely eliminated the database and DBAs from their development process...

    Now, 4.5 years into the project (again, it was only going to be 18 months long, eliminating the DBAs is part of what makes it so much faster to deliver...) they still haven't delivered and they're finding HUGE performance issues. Plus, here's a shocker, people actually expect to be able to run reports on the data, but since it's stored as objects, you can't easily cut across it. They are throwing money at the problem currently. They've implemented Availability Groups to create a read-only mirror of the database so that they can remove reporting from the production system, but that didn't fix the performance issues. Now, they're working on a full-blown database design and will build a load process to go from the object model to a relational model, but only for reporting. It's a nightmare. What's more, it's a nightmare that I predicted, in a detailed report, written 4.5 years ago when the project started.

    Wow, I can't believe they are still pushing forward in the same manner. I remember these stories about the project from some time ago and am dumbfounded that they have not changed (well, except that you left them :-D).

    I think in this case, the project has effectively failed a couple of times already, so they're pushing ahead regardless of issues with methodology. I know that no other projects at the company are following this approach, even though, for a time, this was going to be the New Model.

    But yeah, a big part of what made me quit was this project. When I'm standing there, telling them everything that is going to go wrong and getting ignored... Then, when they brought in a Microsoft consultant who told them what I told them... Then, when that consultant actually said to them, "Why did you bring me in for this? You have Grant sitting right across the hall." Just couldn't stand it there after that.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning
