What is Big Data?

  • Interesting, yes. I fully agree that the RDBMS is far from dead; it is still very much what is used in many new products.

    To describe big data as simply "awkward to work with" is just silly. That's what people who cannot grasp large numbers or build models in their heads say.

    However, for some systems the traditional RDBMS products are not well suited, though I do not think this really has to do with the RDBMS concept so much as with the features of particular products. For example, SQL Server and the C# CLR will bog down and go really slow if pressured really hard, but Oracle will deliver. Still, there are real cases where the RDBMS is not practical. If you were, say, Facebook and wanted to find your friends' friends out to three levels, that is not code that will come quickly to you in SQL, or it won't look that great, while it works well in NoSQL.
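
    To make the friends-of-friends case concrete, here is a minimal Python sketch of a breadth-first search out to three hops. It assumes the friendship graph fits in memory as an adjacency dict; the function name and sample data are made up for illustration. In SQL, the same question is usually answered with a recursive CTE or a chain of self-joins on a friendship table.

```python
from collections import deque


def friends_within(graph, start, max_depth=3):
    """Return {person: hops} for everyone reachable from start in at most max_depth hops."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        # Do not expand people who are already at the depth limit.
        if depths[person] == max_depth:
            continue
        for friend in graph.get(person, ()):
            if friend not in depths:  # first visit is the shortest path in BFS
                depths[friend] = depths[person] + 1
                queue.append(friend)
    del depths[start]  # the starting person is not their own friend
    return depths


# Hypothetical sample data: a simple chain ann - bob - cid - dee - eve.
graph = {
    "ann": ["bob"],
    "bob": ["ann", "cid"],
    "cid": ["bob", "dee"],
    "dee": ["cid", "eve"],
    "eve": ["dee"],
}
result = friends_within(graph, "ann")
```

    With this sample graph, eve is four hops from ann and so falls outside the three-level limit.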

  • "There are lots of data sets that work well inside of a relational database, especially when you have built a solid data model and referential integrity. Many problems with relational databases come about because of poor code added to the platform, not a problem with relational platforms in general."

    Man, is that the truth! Very well said. However, in the end what always tends to get the blame first? The database. :-D

    "Technology is a weird thing. It brings you great gifts with one hand, and it stabs you in the back with the other. ...:-D"

  • To answer the question "What is big data," I would offer the following definition:

    You are dealing with big data when handling rowsets of the given size becomes the make-or-break criterion of the design.

    Note that this depends on the hardware: the threshold will be different for a server with four cores and 24 GB than for a server with 16 cores and 64 GB. It also depends on the performance criteria, because the same or a similar JOIN may be too slow for a transactional app but perfectly OK for an overnight load of cubes.

  • Revenant (5/16/2012)


    You are dealing with big data when handling rowsets of the given size becomes the make-or-break criterion of the design.

    That's an interesting definition. It certainly makes some sense.

  • "Relational database technology is pretty much obsolete and archaic.", Geoffrey P. Malafsky, Ph.D. from the Karen Lopez link.

    What an idiot. Saying relational technology is obsolete is like saying algebra is obsolete; utter nonsense. Relational theory, which the technology is based on, is pure mathematics. I love the purpose of his company: Agile Data Governance and Standards. When you hear anything with "governance" in it, get up and run. Those are code words for wasted time and high consultant fees.

    But what else would you expect from someone making such a ridiculous comment?

  • CrankyRat (5/16/2012)


    "Relational database technology is pretty much obsolete and archaic.", Geoffrey P. Malafsky, Ph.D. from the Karen Lopez link.

    What an idiot. Saying relational technology is obsolete is like saying algebra is obsolete; utter nonsense. Relational theory, which the technology is based on, is pure mathematics. I love the purpose of his company: Agile Data Governance and Standards. When you hear anything with "governance" in it, get up and run. Those are code words for wasted time and high consultant fees.

    But what else would you expect from someone making such a ridiculous comment?

    Some fifteen years ago I heard many times that I was wasting my time learning SQL Server. The future, I was told, belonged to object databases.

    Just ignore these academics and they will eventually go away.

  • "Just ignore these academics and they will eventually go away." - SSCrazy

    I wish. It just drives me crazy when these PhDs hang out their shingle to sell fad this and buzzword that to the hopeful and ignorant. It can take a long time to unwind management from the latest shiny object, carefully explaining that the new toys are not such a good idea. They are so happy the holy grail has finally arrived, and I'm the bearer of bad news. On top of that you've got the young ones, fresh out of school, who think everything new is revolutionary (I did when I was young).

    There is good stuff out there, but it is almost never the loud and bright stuff.

    I should have gone into accounting or law.

  • CrankyRat (5/16/2012)


    "Relational database technology is pretty much obsolete and archaic.", Geoffrey P. Malafsky, Ph.D. from the Karen Lopez link.

    What an idiot. Saying relational technology is obsolete is like saying algebra is obsolete; utter nonsense. Relational theory, which the technology is based on, is pure mathematics. I love the purpose of his company: Agile Data Governance and Standards. When you hear anything with "governance" in it, get up and run. Those are code words for wasted time and high consultant fees.

    But what else would you expect from someone making such a ridiculous comment?

    Malafsky simply ran into the usual confusion between "not the newest version" and "obsolete". By the definition he's using, spoken communication is "obsolete", because, let's face it, it's a bit aged, and there are significantly more efficient means of data transmission available these days. So are wheels, knives, fire, walking, food, and a million other things.

    There's a big difference between actually obsolete (buggy whips, crossbows), old and proven (speaking, wheel, knife), semi-obsolete but fashionable (cooking over fire), on-the-way-out (landline phones), and should-be-retired-already-but-isn't (coal power plants). Relational data storage is "old and proven" so far as I can tell.

    There's a very good book on the subject: http://www.future-hype.com/book.php

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • And on the subject of "what is big data", I define it as any data where quantity has more of an impact on performance than normalization does. In other words, where you can get more of an impact on performance by managing storage than by managing normalization.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC

  • GSquared (5/17/2012)


    CrankyRat (5/16/2012)


    "Relational database technology is pretty much obsolete and archaic.", Geoffrey P. Malafsky, Ph.D. from the Karen Lopez link.

    What an idiot. Saying relational technology is obsolete is like saying algebra is obsolete; utter nonsense. Relational theory, which the technology is based on, is pure mathematics. I love the purpose of his company: Agile Data Governance and Standards. When you hear anything with "governance" in it, get up and run. Those are code words for wasted time and high consultant fees.

    But what else would you expect from someone making such a ridiculous comment?

    Malafsky simply ran into the usual confusion between "not the newest version" and "obsolete". By the definition he's using, spoken communication is "obsolete", because, let's face it, it's a bit aged, and there are significantly more efficient means of data transmission available these days. So are wheels, knives, fire, walking, food, and a million other things.

    There's a big difference between actually obsolete (buggy whips, crossbows), old and proven (speaking, wheel, knife), semi-obsolete but fashionable (cooking over fire), on-the-way-out (landline phones), and should-be-retired-already-but-isn't (coal power plants). Relational data storage is "old and proven" so far as I can tell.

    There's a very good book on the subject: http://www.future-hype.com/book.php

    Thanks for the link -- it looks very interesting.

  • GSquared (5/17/2012)


    There's a big difference between actually obsolete (buggy whips, crossbows), old and proven (speaking, wheel, knife), semi-obsolete but fashionable (cooking over fire), on-the-way-out (landline phones), and should-be-retired-already-but-isn't (coal power plants). Relational data storage is "old and proven" so far as I can tell.

    There's a very good book on the subject: http://www.future-hype.com/book.php

    Great analogy, great link. Thanks G^2

  • CrankyRat (5/16/2012)


    "Relational database technology is pretty much obsolete and archaic.", Geoffrey P. Malafsky, Ph.D. from the Karen Lopez link.

    What an idiot.

    I'm really happy to see someone else thought those exact words on this subject.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

  • We'll know how big "Big Data" is when we know how long a piece of string is. Any answer is absurd without a specified context. I like Gus's definition (it's "big" if the size matters more for performance than the structure) because it gives a context -- performance. But it may be perfectly legitimate to rate data as "Big" if its size affects other parameters; perhaps 5 TB is not big, but 6 TB would be "Big" for an environment that would need additional hardware to accommodate it.
