Big Data

  • Comments posted to this topic are about the item Big Data

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    This thing is addressing problems that don't exist. It's solutionism at its worst. We are dumbing down machines that are inherently superior. - Gilfoyle

  • It'll go the route every other new shiny thing in tech goes, I would imagine:

    1) There'll be a proliferation of home-rolled, cobbled-together code to perform big data gathering, storage, access, and analysis.

    2) Some bright CS/Math Ph.D. student and/or prof will prove some basic theorems along these lines regarding efficiency, integrity, and the like.

    3) The programming world will coalesce around that foundational work.

    4) Frameworks will begin to appear based on that work, resulting in standardization of the topic.

    4a) There will be non-stop religious wars of the form "zomg dotnet suxxors for this! clearly *nix is the only reasonable platform to do this on!".

    5) People will say it's no big deal, and start looking for the next new shiny thing.

    6) Throughout the entire process, there will be a group that naysays the entire investigation.

  • In the past, most data ended up in a database because users deemed it important enough to enter manually (patient charts, student records, inventory, etc.).

    Today, the bulk of "big data" is stuff like web click stream analysis, phone call records, highway traffic monitoring, serialized application objects, etc. Most of it (as a percentage of total volume) is derivative, duplicated, and mundane. We have to slog through gigabytes and terabytes of digital... detritus... to produce little nuggets of relevant business intelligence.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • For better or worse, big data is my daily bread. We are staging over 150 million rows a day, which I think is "big." I have been warned that it will be 250 million a day before the end of this year.

    Big data is not the future, it is the present.
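For a sense of scale, the daily volumes quoted above can be turned into sustained per-second rates. This is only a back-of-the-envelope sketch: it assumes the load is spread evenly over 24 hours, which a real staging job almost certainly is not, and the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def rows_per_second(rows_per_day: int) -> float:
    """Average rate if the daily load were spread evenly over the day (an assumption)."""
    return rows_per_day / SECONDS_PER_DAY


current = rows_per_second(150_000_000)    # figure quoted in the post
projected = rows_per_second(250_000_000)  # figure the poster was warned about

print(f"{current:,.0f} rows/s now, {projected:,.0f} rows/s projected")
# prints "1,736 rows/s now, 2,894 rows/s projected"
```

If the staging window is shorter than a full day, the required rate goes up proportionally.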

  • What I've found so far is that Big Data is a nice vague term of the kind IT vendors love, rather like "Cloud".

    There is Big Data as in a vast amount of data (either in number of records or in storage volume) and the challenges that go with that.

    There is Big Data as in a big technical challenge to mine, because it is not in a form that traditional RDBMSs were designed to handle.

    The latter requires "Big Brains" to figure out how to get meaning from the verbal diarrhea and literary dysentery from social media sites.

    Then there is the combination of the two.

    Some of the web analytics and click-stream data can be huge in terms of record counts but not necessarily as big as you would think in terms of TB.

    As ever, one of the hardest challenges is getting some meaningful requirements out of customers for the big data. They kind of know what they want but can't really articulate it.

    If you have big data in the cloud, then what sort of queries are you going to run against it? If the query is "analyze a humongous amount of data and bring back a small, simple answer set", then you are in with a chance.

    If it is "bring back a large data set", then that is going to sting a bit.

    To make this work well, or indeed at all, database guys and storage/infrastructure guys are going to have to work so closely together that people will think it is possible for Siamese twins to be born into different families!
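The small-answer-set versus large-data-set point in the last post can be illustrated with a toy example. This is a minimal sketch, not anyone's actual setup: it uses an in-memory SQLite database as a stand-in for a remote store, and the `clicks` table and its columns are invented for the illustration.

```python
import sqlite3

# In-memory SQLite stands in for a remote data store (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (page TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?)",
    [(f"/page/{i % 5}", i) for i in range(10_000)],
)

# Option 1: let the database aggregate and return a small answer set.
summary = conn.execute(
    "SELECT page, COUNT(*) FROM clicks GROUP BY page ORDER BY page"
).fetchall()

# Option 2: pull every row back and aggregate on the client.
everything = conn.execute("SELECT page, user_id FROM clicks").fetchall()
client_counts = {}
for page, _user_id in everything:
    client_counts[page] = client_counts.get(page, 0) + 1

# Same answer either way, but very different numbers of rows cross the wire.
print(f"aggregated in the database: {len(summary)} rows returned")
print(f"aggregated on the client:   {len(everything)} rows returned")
```

Both options produce the same counts; the difference is that the second moves all 10,000 rows to the client first, which is the part that "stings" once the store is remote and the table is several orders of magnitude larger.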
