Limitations of MongoDB

  • We started using MongoDB to cache data for our front-end to make it a little easier to present and not constantly hit the OLTP system while working with that data. The first thing we learned is that order of operations is not enforced the same way as SQL Server. We did a "delete, insert" to clear out existing data then repopulate. MongoDB decided that "insert, delete" was more efficient and chose that, resulting in an empty set for the user.

    We resolved that and were able to move on. That was an early version of the software and driver. Later, we were able to take advantage of the expiration of data in Mongo for the caching purposes and moved from the ASPState Session state database to MongoDB. That helped our performance quite a bit because MongoDB is designed for that sort of data - unknown amount of data in the session, short lifespan, quick to read/write by session id. That was definitely one of the better moves we had for it.
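    A minimal sketch of the session pattern described above: data of unknown shape, a short lifespan, and lookup by session id. It is an in-memory stand-in, not MongoDB's API; MongoDB does the expiry on the server through TTL indexes. The class name `SessionCache` and the injectable `clock` parameter are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class SessionCache:
    """In-memory stand-in for a session store whose entries expire."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: Dict[str, Tuple[float, Dict[str, Any]]] = {}

    def put(self, session_id: str, data: Dict[str, Any]) -> None:
        # Each write resets the expiry time for that session.
        self._items[session_id] = (self._clock() + self._ttl, data)

    def get(self, session_id: str) -> Optional[Dict[str, Any]]:
        entry = self._items.get(session_id)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily, on the next read.
            del self._items[session_id]
            return None
        return data
```

    The point of the sketch is only that the store, not the application, decides when a session stops existing.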

    My biggest concern - we still don't have a good (or at least tested) backup solution for MongoDB. I'm hesitant to store anything there that is necessary and could be lost. It's also a bit hard to join that data into other queries so storing data there that could be useful when combined with data in the OLTP/reporting systems is hard to justify. That being said, it's working really well to cache short-term data for our users and for sessions. I'm a big fan of NoSQL solutions for things like that.

    To the author's point - yes, some people see MongoDB as the tool to deliver everyone out of the hands of the OLTP systems. As others have pointed out, they tend to miss the bigger issues that could come up just because they don't want to deal with something like SQL Server. I think it will balance out, but some data may be lost along the way. :-/

  • Brandon Forest (3/10/2015)


    It helps when a DBA comes up from the developers ranks

    Up from the developer ranks? Someone thinks highly of his DBA self 😉

  • unaligned (3/10/2015)


    Brandon Forest (3/10/2015)


    It helps when a DBA comes up from the developers ranks

    Up from the developer ranks? Someone thinks highly of his DBA self 😉

    LOL

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • As with all technologies, I think the real issue is when a tool is promoted and used far beyond what it was intended to do. Reading some of the earlier NoSQL hype, it's easy to think that the goal is to replace all that came before. "Hadoop will replace relational databases...."

    I did get a bit of humor from one of the earlier comments about how Mongo is easy to administer, yet others have indicated that backup/restore was not. Hmm? Especially since SQL Server, since version 7, has been perhaps a bit too easy to back up and restore ("..., Oh, our executive admin runs the database backups...").

    The more you are prepared, the less you need it.

  • About a year ago, a big technology company started a complete re-write of their applications, with the initial hope of completely eliminating their relational database system. Their initial goal was to use Cassandra for all activity. About 6 months into the process, they finally realized that Cassandra's lack of, or minimal support for, transactions, FKs, joins, etc. was prohibitive.

    So, what was initially a total re-write became more of an update, where they went back to using those nasty relational databases for such things as transactions and data integrity.

    The more you are prepared, the less you need it.

  • Eric M Russell (3/10/2015)


    In other words, if you update a MongoDB document, perhaps something like adding additional office visitation records to a patient's chart, then the change of size operation would require that the updated document be re-written to a new physical location on file. I'm thinking here about fill factor, page splits, and fragmentation.

    I agree with the point of your post but I have to jump on this paragraph as a poor example.

    Most of what has been said here is that there is a place for a data storage system where a lack of data integrity is tolerable. I submit that medical data is not a place where overall integrity or data loss is tolerable.

    I get the document store idea and that is a great thing. Likewise building on that is the idea that a document may or may not be related to any other document in the store. There is a place for data hives like this.

    ATB, Charles Kincaid

  • Andrew..Peterson (3/10/2015)


    About a year ago, a big technology company started a complete re-write of their applications, with the initial hope of completely eliminating their relational database system. Their initial goal was to use Cassandra for all activity. About 6 months into the process, they finally realized that Cassandra's lack of, or minimal support for, transactions, FKs, joins, etc. was prohibitive.

    So, what was initially a total re-write became more of an update, where they went back to using those nasty relational databases for such things as transactions and data integrity.

    Well, it must have been a cool project to work on while it lasted, and I'm sure that half the developers on the team have since moved on to offer their database migration expertise to yet another unsuspecting big corporation. :unsure:

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I really don't understand this fascination with the lack of concrete foreign key constraints. MongoDB supports references. Yes, it requires that your devs know what they're doing, but honestly, if they don't, then why were they hired?
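    A rough sketch of what a reference without a constraint looks like in practice: one document stores the `_id` of another, and the application follows it by hand. The two dictionaries stand in for collections; their names, fields, and values are made up for illustration and nothing here touches a real MongoDB driver.

```python
from typing import Any, Dict, List, Optional

# Two hypothetical "collections", keyed by _id.
customers: Dict[int, Dict[str, Any]] = {
    1: {"_id": 1, "name": "Acme Ltd"},
}
orders: Dict[int, Dict[str, Any]] = {
    100: {"_id": 100, "customer_id": 1, "total": 250},
    101: {"_id": 101, "customer_id": 2, "total": 75},  # points at no customer
}


def resolve_customer(order: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Follow the reference by hand; nothing in the store enforces it."""
    return customers.get(order["customer_id"])


def dangling_orders() -> List[int]:
    """Return the _ids of orders whose referenced customer does not exist."""
    return [oid for oid, order in orders.items()
            if resolve_customer(order) is None]
```

    Order 101 was accepted without complaint, so checks like `dangling_orders` become the developers' job rather than the database's.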

    Transactions, as noted here, are a different animal. Data stores that require that data integrity is enforced should probably not use MongoDB.

  • Charles Kincaid (3/10/2015)


    Eric M Russell (3/10/2015)


    In other words, if you update a MongoDB document, perhaps something like adding additional office visitation records to a patient's chart, then the change of size operation would require that the updated document be re-written to a new physical location on file. I'm thinking here about fill factor, page splits, and fragmentation.

    I agree with the point of your post but I have to jump on this paragraph as a poor example.

    Most of what has been said here is that there is a place for a data storage system where a lack of data integrity is tolerable. I submit that medical data is not a place where overall integrity or data loss is tolerable.

    I get the document store idea and that is a great thing. Likewise building on that is the idea that a document may or may not be related to any other document in the store. There is a place for data hives like this.

    Yes, data integrity is especially important for something like healthcare records. My point above was that, generally speaking for any CRM-type application, if all data for an object is serialized into a self-contained document, then subsequent appending of new data items becomes an in-place update rather than an incremental insert, which in many cases would actually be a delete followed by another full insert. That would pose a problem for something like a CRM application that has many dimensions and gets appended to many times after the initial insert.
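    A toy comparison of the two write patterns, assuming that write cost can be approximated by serialized size. That assumption is a simplification and says nothing about how any particular storage engine lays out documents on disk. The patient chart fields are invented for illustration.

```python
import json
from typing import Any, Dict, List


def document_bytes_rewritten(visits: int) -> List[int]:
    """Size of the whole serialized chart after each appended visit."""
    chart: Dict[str, Any] = {"patient_id": 42, "visits": []}
    sizes = []
    for n in range(visits):
        chart["visits"].append({"visit_no": n, "note": "routine check"})
        # The entire document is serialized again on every append.
        sizes.append(len(json.dumps(chart).encode("utf-8")))
    return sizes


def row_bytes_written(visits: int) -> List[int]:
    """Size of just the new row for each appended visit."""
    return [
        len(json.dumps(
            {"patient_id": 42, "visit_no": n, "note": "routine check"}
        ).encode("utf-8"))
        for n in range(visits)
    ]
```

    In this model the document write grows with every visit while the row write stays flat, which is the concern for records that are appended to many times.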

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I currently support Mongo, SQL Server, MySQL, and increasingly Postgres, which offers an interesting hybrid document db datastore on top of a relational db.

    The foreign key discussion is a red herring. The whole point of a document database is that the entire "business fact" is contained within the individual record, so foreign keys are redundant. Transactions are a real issue, but only for operations that need to affect a bunch of documents consistently at a point in time, like end of month close type use cases. For many systems the fact that individual document writes are inherently consistent is good enough. Within that limitation data written to mongo is as "safe" as an RDBMS provided you understand the limits.

    We use mongo for a huge variety of use cases. We are a global company with a need to have our full databases available globally but optimized for local reads and writes. Mongo's tag-aware sharding simplifies that tremendously versus sharding traditional RDBMS (which have issues with FKs and transactions themselves). Our business cases do require consistent reliable writes, but minimal real-time batch processes, which we handle within the application. We also use it as a "rich" cache and as the primary datastore for less critical attributes of entities in our transactional system. For example, customer metadata that is not necessary to process transactions lives in mongo, which keeps the transaction processing pipeline lean.

  • cdesmarais 49673 (3/10/2015)


    I currently support Mongo, SQL Server, MySQL, and increasingly PostGres, which offers an interesting hybrid document db datastore on top of a relational db.

    Great, so what we are hearing is that the tool matches the need.

    A million points (as they say in Whose Line Is It Anyway?)

    The more you are prepared, the less you need it.

  • Andrew..Peterson (3/10/2015)


    cdesmarais 49673 (3/10/2015)


    I currently support Mongo, SQL Server, MySQL, and increasingly PostGres, which offers an interesting hybrid document db datastore on top of a relational db.

    Great, so what we are hearing is that the tool matches the need.

    Exactly: using the right tool for the right job, taking advantage of the strengths of the tool while minimizing the effect of the weaknesses. Excellent example!

  • I'm in the "this misses the point" camp.

    Let's take transactions as an example. In the relational world a "thing" would be represented by a set of related tables with appropriate foreign key constraints. Population of that set of tables would necessitate the use of transactions.

    In the document database world this is dispensed with simply because the entire "thing" is a single document. The "thing" is committed as an entire entity or not at all, thereby achieving the same goal as is fulfilled by multi-table relational transactions.
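    A small sketch of that contrast, using Python's built-in `sqlite3` for the relational side and a plain JSON string for the document side. The order and order-line tables are invented for illustration, and the document is only built here, not stored in any document database.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE order_lines ("
    " order_id INTEGER NOT NULL REFERENCES orders(id),"
    " sku TEXT NOT NULL, qty INTEGER NOT NULL)"
)


def save_relational(order_id, customer, lines):
    """Insert the parent row and its child rows in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, customer))
        conn.executemany(
            "INSERT INTO order_lines VALUES (?, ?, ?)",
            [(order_id, sku, qty) for sku, qty in lines],
        )


def as_document(order_id, customer, lines):
    """The same 'thing' expressed as one self-contained document."""
    return json.dumps({
        "_id": order_id,
        "customer": customer,
        "lines": [{"sku": sku, "qty": qty} for sku, qty in lines],
    })
```

    If any line fails its constraint, the relational version rolls back the parent row as well. The document version has nothing to roll back because there is only one write.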

    Then let's take the schema argument. In the hands of cowboys then yes, whatever ill-thought-out bodge can be slapdashed into a document database provided it is a valid JSON document. I've seen similar things done in relational databases and if you haven't, well there is always your 2nd week as a DBA to come.

    I digress. Schema enforcement is possible through the use of a serialisation framework such as Avro, Thrift, Kryo etc. The application makes use of such a framework to enforce a defined schema. Schema on write scales out because the framework is part of the application. Granted, the serialisation framework isn't part of MongoDB or any other document DB for that matter. (I expect a MarkLogic guru to pipe up at this point.)
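    A minimal sketch of schema-on-write living in the application. It uses a standard-library dataclass as a stand-in for a real serialisation framework; it is not Avro, Thrift, or Kryo, and the `Product` fields are invented for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Product:
    sku: str
    name: str
    price_cents: int

    def __post_init__(self) -> None:
        # Dataclasses do not check types on their own, so check here.
        if not isinstance(self.sku, str) or not self.sku:
            raise ValueError("sku must be a non-empty string")
        if not isinstance(self.name, str) or not self.name:
            raise ValueError("name must be a non-empty string")
        if (isinstance(self.price_cents, bool)
                or not isinstance(self.price_cents, int)
                or self.price_cents < 0):
            raise ValueError("price_cents must be a non-negative integer")


def to_document(raw: dict) -> str:
    """Validate raw input against the schema, then serialise it."""
    product = Product(**raw)  # unknown or missing fields raise TypeError
    return json.dumps(asdict(product))
```

    Only documents that pass `to_document` would be handed to the database, so the schema travels with the application code and scales out with it.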

    MongoDB does 80% of what an RDBMS does? Absolutely not. The more accurate statement would be that MongoDB does 80% of what developers think an RDBMS does and what the developers need.

    Good use cases for document databases are the capture of flexible web form data, product catalogue management where there is an extremely diverse set of attributes on products, or more generally use cases where there is a high degree of optionality/cardinality in attributes.

    The NOSQL world is waking up to the value of some form of structured query language. Years ago I raised the point that NOSQL actually meant NO-RDBMS and got shot down in flames. With the adoption of CQL, N1QL, AQL and lots of things that look a lot like SQL I'm saying "I told you so". Getting data out of a NOSQL solution and into a data warehouse is still a problem to be cracked; however, projects such as Apache Spark and Apache Drill (based on Google Dremel) clearly show that considerable thought has gone into mass processing of strangely shaped data.

    I think it is as dangerous for a DBA to say "NOSQL doesn't fulfil my company's use case" as it is for NOSQL vendors to expect traditional RDBMS vendors not to react to their challenge. SQL Server column store is not relational. FileStream is not relational. Full-text indexing is not relational (it's not very good either, but that is another story). In-memory OLTP is not relational, etc. SSIS DataReader destinations allow SSIS to transform bizarre data on the fly to something that would consume a DataReader.

  • I think this describes some of the disconnect between the only NoSQL camp and the only SQL camp.

    **beware some language**

    https://www.youtube.com/watch?v=Nu1UQblRQdM

  • David.Poole (3/10/2015)


    Years ago I raised the point that NOSQL actually meant NO-RDBMS and got shot down in flames. With the adoption of CQL, N1QL, AQL and lots of things that look a lot like SQL I'm saying "I told you so".

    David, I think you're correct in the true meaning of NoSQL databases. The only reason I can think of that people think of them more as NoSQL instead of NoRDBMS is that for the most part the only interface to a NoSQL database is via procedural programming.

    Years ago I spoke with a group that was very proud of their MongoDB implementation that sharded their data over 10 servers and took 8 hours to process, although I had a SQL Server setup on my laptop that did the equivalent process for an even larger data set in 45 minutes. I don't think the issue was MongoDB but more in how they utilized it with their Ruby-on-Rails application. They used MongoDB merely as a data storage mechanism and had their code do all the work. I still think that set-based processing of a sort is possible with NoSQL environments, but these guys didn't give it a chance by having their code literally look at each document to determine if it was what they wanted before moving on to the next.

    Can set-based queries or requests be made of a NoSQL implementation to off-load some of the work to the DB server? It seems to me that it would have to have that capability otherwise there would be no more value to it than searching a directory of files.
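    For MongoDB specifically, one route is the aggregation pipeline, which lets the server filter and group before anything reaches the client. The sketch below contrasts a document-by-document loop with the equivalent pipeline definition. The sample documents and field names are invented, and the pipeline is only defined here, not executed; with PyMongo it would be passed to `Collection.aggregate`.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List

# Hypothetical documents; in the scenario above these live on the server.
docs: List[Dict[str, Any]] = [
    {"region": "EU", "status": "paid", "amount": 120},
    {"region": "EU", "status": "open", "amount": 80},
    {"region": "US", "status": "paid", "amount": 200},
    {"region": "EU", "status": "paid", "amount": 30},
]


def totals_client_side(all_docs: Iterable[Dict[str, Any]]) -> Dict[str, int]:
    """Document by document: everything is pulled to the client and inspected."""
    totals: Dict[str, int] = defaultdict(int)
    for doc in all_docs:
        if doc["status"] == "paid":
            totals[doc["region"]] += doc["amount"]
    return dict(totals)


# The same request as an aggregation pipeline: filter first, then group and
# sum, so only one small result per region would come back from the server.
paid_totals_pipeline: List[Dict[str, Any]] = [
    {"$match": {"status": "paid"}},
    {"$group": {"_id": "$region", "total": {"$sum": "$amount"}}},
]
```

    The loop is what the Rails application above was effectively doing. The pipeline hands the same filtering and summing to the database.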
