Mongo Jumbo Backups

  • Tony Davis

    SSCarpal Tunnel

    Points: 4385

    Comments posted to this topic are about the item Mongo Jumbo Backups

  • x

    SSC-Insane

    Points: 23578

    This question is for Tony: what's the largest MongoDB installation that you've administered?

  • Jim P.

    SSCrazy Eights

    Points: 8725

    Another consideration is what your business is, and whether a point-in-time (PIT) recovery is still worth the time after 4 or 24 hours.

    If a manager deleted the day's transactions at 4 PM and realized it at 5 PM, then yes, a PIT recovery is probably the best way. But if the manager deleted the records from before 01/01/2014 when the cutoff was supposed to be 01/01/2013, and he finds out two days later, a PIT recovery back to two days ago loses all the work done in those two days. The cost falls on all the employees who use the DB, who have to enter that work a second time while trying to maintain normal production.

    I have always looked at a PIT recovery as essentially a 24-hours-or-less decision. Beyond that, the daily backups should be restored in parallel to a separate copy, and the missing data then copied across.
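
    As a rough illustration of the "restore in parallel, then copy across" idea, here is a minimal Python sketch. It assumes the backup has already been restored to a separate copy and that both copies can be read as dictionaries keyed by record ID. The function name `copy_back_deleted`, the `date` and `amount` fields, and the sample records are all hypothetical and not tied to any particular database.

```python
from datetime import date


def copy_back_deleted(restored, live, wrongly_deleted):
    """Copy records deleted by mistake from a restored backup into live.

    restored / live: dicts mapping record ID -> record (hypothetical shape).
    wrongly_deleted: predicate selecting records that should still exist.
    Records already present in live are never overwritten, so work done
    since the backup was taken is preserved.
    """
    recovered = []
    for record_id, record in restored.items():
        if record_id not in live and wrongly_deleted(record):
            live[record_id] = record
            recovered.append(record_id)
    return sorted(recovered)


# Hypothetical scenario: records before 2014-01-01 were deleted, but the
# cutoff should have been 2013-01-01, so everything dated in 2013 must return.
restored = {
    1: {"date": date(2012, 6, 1), "amount": 10},   # correctly deleted
    2: {"date": date(2013, 3, 15), "amount": 20},  # wrongly deleted
    3: {"date": date(2013, 11, 2), "amount": 30},  # wrongly deleted
    4: {"date": date(2014, 1, 20), "amount": 40},  # never deleted
}
live = {
    4: {"date": date(2014, 1, 20), "amount": 45},  # edited since the backup
    5: {"date": date(2014, 2, 3), "amount": 50},   # entered since the backup
}

recovered = copy_back_deleted(
    restored,
    live,
    lambda r: date(2013, 1, 1) <= r["date"] < date(2014, 1, 1),
)
print(recovered)  # [2, 3]
```

    The point of the sketch is that only the wrongly deleted 2013 records come back, while records edited or entered since the backup are left alone, which is what a PIT recovery to two days ago would not do.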



    ----------------
    Jim P.

    A little bit of this and a little byte of that can cause bloatware.

  • davewithers-574156

    Valued Member

    Points: 71

    Great editorial... I thought I was missing something about Mongo's backup capabilities on shards... now I know I'm not 🙂

    A correction, though: Mongo is a document-store (JSON/BSON) database, not a "key-value" store (like Hadoop) as mentioned in the editorial.

  • Gary Varga

    SSC Guru

    Points: 82166

    For me this highlights what I take to be a fact: NoSQL databases are not replacements for relational databases and have totally different operational profiles. This may mean that backups are superfluous (may, MAY!!!) but, of course, it depends.

    As data professionals (DBAs, developers, architects of various types, etc.) we need to consider data backup, NOT database backup. I make this distinction because we need to consider why a backup is needed, e.g. it may be "good enough" to regenerate the data or even to lose it. These are not offered as THE choices, just possible ones amongst many.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • David.Poole

    SSC Guru

    Points: 75375

    From a developer perspective the NoSQL databases do cool stuff and make the developer's task of interacting with data much simpler.

    The boring stuff that makes it all work is the really hard stuff. Distributed computing is hard. There is no getting away from it; there is no magic wand. The NoSQL databases allow you to make an informed choice: "I will buy this way of doing stuff because I can bear the cost of doing it that way."

    The problem is that the choice isn't always that well informed or even informed.

    One of the selling points of a distributed system is that fault tolerance is built in by holding multiple copies of everything. Because of this extra level of redundancy, the argument goes, you can run this little lot on commodity kit.

    Commodity kit is much more reliable than it was, but anyone using it should still expect hardware failure. What we really need to do is rehearse disaster recovery scenarios, learn from the mistakes and share the knowledge, just as we used to do back in the early days of SQL Server/Oracle/DB2/Sybase etc.

  • Jim P.

    SSCrazy Eights

    Points: 8725

    David.Poole (2/3/2014)


    "From a developer perspective ... The problem is that the choice isn't always that well informed or even informed."

    But I have seen some of the uninformed development types. And that was with a set of Access DBs for ETL (stovepiped systems). After the uninformed dev type was no longer with the company, we rebuilt all his stuff from the ground up and saved the Level I staff about 25-30 hours of processing time.

    "One of the selling points of a distributed system is that fault tolerance is built in by holding multiple copies of everything. Because of this extra level of redundancy, the argument goes, you can run this little lot on commodity kit."

    The problem with going to the commodity kit model is that the typical management staff doesn't understand that a bunch of Dell Tungsten level servers with a RAID setup is never going to match the performance of a full-up server designed with the production model in mind.

    So there is a combo of problems, and I have run into it before. I have had apps delivered where I delved into the SQL code and it was horrible. I had no clue about the exe/dll content, but the SQL didn't reassure me. The DB normalization was very questionable. It looked mostly like fourth normal form that had then been bast***tized down to second.

    The devs who write very questionable code meet up with the user/buyer company management that will only buy crap commodity equipment, and your app runs like crap.

    The buyer company IT team is screwed by management and the dev company because the end-user says "I should use a typewriter and a form." And what's sad is they are right.

    So I look at NoSQL as a waste until the devs can write efficient code and management is willing to invest in infrastructure.




  • David.Poole

    SSC Guru

    Points: 75375

    Jim, my experience is that you will rarely get isolated pockets of woes.

    If you don't like the SQL then a good developer will find that the code is wonky. A good network guy will find horrors in the network infrastructure. A good business analyst will look at the business processes and diplomatically say WTF!

    All in all, you will have a system that isn't exactly uncrap.
