Losing Rows

  • Comments posted to this topic are about the item Losing Rows

  • A better question to ask is, when is it better to use a NoSQL-DB than an RDBMS?

    Which tasks do they handle reliably well as well as performing better than an RDBMS?

    And by 'better', I mean the criterion of choice: faster, more efficiently, handles massive volumes better and so on.

  • Sean Redmond (7/12/2016)


    A better question to ask is, when is it better to use a NoSQL-DB than an RDBMS?

    Which tasks do they handle reliably well as well as performing better than an RDBMS?

    And by 'better', I mean the criterion of choice: faster, more efficiently, handles massive volumes better and so on.

    From what I understand from other articles, NoSQL systems are really great when you need to scale out to handle large amounts of data, or when you need to "shard" your data across geographic areas for performance reasons AND you can accept and deal with the lack of immediate consistency. Probably a poor example, but think of an inventory system for a nationwide retailer with multiple warehouses around the country to ship product from. Customers don't really care which warehouse an item comes from, so you collect all your inventory information into NoSQL storage that gives quick results that may not be 100% accurate (Oh look! They've got the 50 gallon drum of maple syrup in stock! But it's only at the warehouse on the other side of the country.) The actual "in-warehouse" inventory systems would more likely be a "classic" RDBMS so that they keep a better eye on what stock is going in and out.

    On the topic of the editorial and the MongoDB article, if I recall it's possible to get a similar sort of race condition, resulting in missing or wrong results, if you have a query using (wait for it...) NOLOCK...

    I think Gail has a blog posting showing how it's possible, from a couple years back (I could be wrong.)
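    To illustrate the kind of race being described, here is a toy sketch. It is not how SQL Server or MongoDB actually implement scans; the function and the dictionary "index" are hypothetical stand-ins. It only shows how a reader walking keys in order, without isolation, can miss (or read twice) a row whose key changes mid-scan.

```python
def scan_with_concurrent_move(rows, move_after, moved_id, new_key):
    """Walk a toy index in key order while another writer changes one row's key.

    rows:       dict of row_id -> index key (hypothetical stand-in for an index)
    move_after: number of rows the scan reads before the concurrent update lands
    """
    seen = []
    last_key = None
    steps = 0
    while True:
        # Next row is the one with the smallest key beyond the last key read
        candidates = [(k, rid) for rid, k in rows.items()
                      if last_key is None or k > last_key]
        if not candidates:
            break
        key, rid = min(candidates)
        seen.append(rid)
        last_key = key
        steps += 1
        if steps == move_after:
            # Simulated concurrent update: the row moves within the index
            rows[moved_id] = new_key
    return seen


rows = {"a": 10, "b": 20, "c": 30, "d": 40}

# Row "d" moves from ahead of the scan to behind it, so it is never read
missed = scan_with_concurrent_move(dict(rows), move_after=2, moved_id="d", new_key=5)

# Row "a" moves from behind the scan to ahead of it, so it is read twice
doubled = scan_with_concurrent_move(dict(rows), move_after=2, moved_id="a", new_key=35)
```

    In the first call the row exists for the whole scan and still never shows up in the result; in the second the same row is counted twice.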

    When choosing your tools, it comes down to this: what limitations are you willing to accept, can you work around them without spending too much time and effort, and will it pay off?

  • The example I run into most often is search engine results. What matters most is speed and relevancy, in most cases, not consistency and "completeness" (whatever that would mean in a dataset that's the size of the whole web). It doesn't matter if you and I both search the exact same thing at the same time, and my 5000th result is different than your 5000th result. It may not even matter if my first 2 results are in reverse order of your first 2 results. What matters most is, do we get relevant results, and do we get them fast.

    In that situation, NoSQL solutions are "best". Can you imagine what would happen if searches caused lock escalation in Google? Ouch!

    For a smaller solution, think of a CRM being used as a "sales opportunity" engine. If I plug in criteria that I want, and it searches a huge demographic database, and you plug in the exact same criteria, does it matter if we get the same lists? Probably not, so long as the lists are relevant and timely, and we'd like fast results. Same thing - NoSQL would work well for this kind of thing.

    There are lots of other examples of a similar nature.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • What was originally coined "NoSQL" should really have been NoACID, since they have since discovered that in order for their database platform to be marketable outside the realms of academia and the open source community (i.e. revenue-generating corporate customers willing to spend $,$$$,$$$), they have to provide a SQL language interface and tabular logical model as an abstraction layer to insulate users from JSON and MapReduce. However, the problem is that SQL and tabular data, by virtue of its expected behaviour, depends on ACID-ity to produce accurate results.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I get badgered about NoSQL regularly as a) it is a buzzword and b) everyone is big data obsessed. Having looked at NoSQL, there are clearly areas in which it is applicable, but not in the area (wellbeing/healthcare) that I am working in, where data must be totally correct! Also, the volume of data is not that large; it's just that some people think 40GB is huge!

  • At the SQLBits event in Westminster a few years back Simon Munroe did an excellent presentation on NOSQL and where he saw it fitting in. At the time I challenged whether it should be called NORDBMS rather than NOSQL. NOSQL is more marketing friendly.

    They are tools to do a specific job and when used for the job for which they were intended they are very effective.

    REDIS is great for session state management. If you have ever had to manage web session state in SQL Server you will breathe a sigh of relief for this one.

    MongoDB has been described as being to NOSQL what MySQL was to RDBMSs. That is a backhanded compliment if ever there was one. Document stores are great for things where you might be tempted to use an EAV model. Product catalogue management is a prime candidate. Surveys are another.

    Neo4J is great at handling graph traversals.

    Elasticsearch is great for scaling largely read-heavy document workloads, especially where full-text search is required.

    Cassandra is great at write-heavy loads. Its tunable consistency model is particularly interesting. When you think that iTunes has a 10PB Cassandra cluster, that gives some measure of the scale these things can handle.
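    To make the tunable consistency idea concrete: in Dynamo-style stores such as Cassandra, a read is generally guaranteed to overlap the latest acknowledged write when the replicas read plus the replicas written exceed the replication factor (R + W > N). A minimal sketch of that arithmetic follows; the helper names are my own invention, and only ONE, QUORUM and ALL are borrowed from Cassandra's consistency level names.

```python
def replicas_required(level, replication_factor):
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1  # strict majority
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported level: {level}")


def read_overlaps_write(write_level, read_level, replication_factor):
    """True when every read set must share a replica with every write set."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


# Replication factor 3, three different trade-offs
fast_but_maybe_stale = read_overlaps_write("ONE", "ONE", 3)    # 1 + 1 > 3 is False
quorum_both_sides = read_overlaps_write("QUORUM", "QUORUM", 3)  # 2 + 2 > 3 is True
write_all_read_one = read_overlaps_write("ALL", "ONE", 3)       # 3 + 1 > 3 is True
```

    The point of "tunable" is that you pick this trade-off per operation rather than once for the whole database.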

    The thing is, when you think of what an RDBMS is, NOSQL is such a broad categorisation as to be meaningless.

    There are so many types of NOSQL and so many products that it is nearly impossible to do them justice.

    The problems of eventual consistency are well known. As an ex-DBA I know that a large proportion of the things I had to worry about were invisible to application developers. Even if they had been visible it would be like trying to teach a drummer to play the lute. From what I have seen NOSQL can be a trap for the unwary. Their strengths allow you to paint yourself much further into a corner before it becomes apparent that you have done so.

  • So, depending on your scale, there are consistency issues in SQL Server. In-flight transactions aren't shown in reports/queries, which can become visible in high volume systems with long transactions. SOA and messaging systems show this. Put an order in on Amazon on your laptop, check your mobile, and it might not be there. But it's in the system and captured, in a transactionally consistent state.

    Go to sharding, or an Azure PaaS system with multiple databases, and you could have different states of queries as items complete on one db, and not another. Even in AGs, on a failover, you could have inconsistency between databases in the AG.

    I think we assume that consistency in the real world matches up with ACID definitions, and most of the time it does, but it certainly doesn't have to. There are real differences, and even then, for the most part, it doesn't matter.

    In healthcare, real time could be problematic. If you have alerts on dispensing (or administering) too much opiate, but it's a data entry problem, do you want an alert going when a practitioner will change 10.0 to 0.10 in 2 minutes? Probably not. How often does someone query actions that happen in real time and need them to be millisecond correct? I'm sure there are cases, but I bet it's also fewer than you think.

    There are places where SQL Server, PostgreSQL, MongoDB, Redis, and Neo4J will all work. There are cases where Redis outshines all others, such as session state that David mentioned. In fact, I know someone that moved session stuff out of SQL Server to Redis, leaving most data in SQL because it's better.

    Use what works, and understand limitations or potential issues.

  • Steve Jones - SSC Editor (7/12/2016)


    Do you know how many times I've had business people run a report and then someone else run the report a minute later and try to compare things?

    For me, the above is either not matching implementation to requirements or not educating users to the system limitations.

    As covered in the posts above, NOSQL has its place like most technologies. It is usually not the technology but the inappropriate selection or invalid implementation of said technology that is the issue.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • You have to choose the right tool for the job. Counting vehicles as they pass under a bridge on an eight lane interstate highway is very different than counting financial deposits as they pass from one account to another. In the case of counting traffic, it's probably the local government trying to get an estimate of how busy the highway is at various times of the day. Whether it was actually 57,319 vehicles or 57,322 that passed under the bridge between 5pm and 6pm yesterday doesn't really matter if all they need to know is that the 30 day average stays below 65,000.
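    A quick back-of-the-envelope check of the traffic numbers above, as a rough sketch. It treats the single hourly count as a stand-in for the 30 day average, which is a simplification, and the variable names are purely illustrative.

```python
exact_count = 57_322    # what "really" passed under the bridge
approx_count = 57_319   # what the lossy counter recorded
threshold = 65_000      # the figure the planners actually care about

abs_error = exact_count - approx_count   # vehicles lost by the counter
rel_error = abs_error / exact_count      # as a fraction of the true count
headroom = threshold - exact_count       # distance from the decision boundary

# The decision only changes if the error is big enough to cross the threshold
same_decision = (exact_count < threshold) == (approx_count < threshold)
error_fits_in_headroom = abs_error < headroom
```

    A three-vehicle discrepancy against thousands of vehicles of headroom cannot change the answer, which is exactly why approximate counting is acceptable here and would not be for account balances.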

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
