An Inconceivable Scale

  • Comments posted to this topic are about the item An Inconceivable Scale

  • I guess that suggests that you visualise, Steve. I can't. I tend to conceptualise (which markedconn referred to as cognitive thinking as opposed to visual thinking in a recent thread).

    This means that I tend not to have issues with infinity, let alone large numbers. On the downside, when I get asked how x (present) will go with/in/on y (not present), I am completely clueless.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • A part of this increase in data-size is our sloppiness in database design. We throw in GUIDs everywhere partly because it is in fashion, partly because it makes development easier [1] and partly because we can. We use unicode by default because it saves us having to worry about codepages. We use datetime2(7) when a simple date would do. We use int and bigint when tinyint would do, e.g. for an enumeration table with the list of states in it. Sizewise it makes almost a negligible difference to the size of the enum table, but in the table with 120 million rows, the difference alone is almost 70MB, and that is before any indexes that use it are taken into consideration. But then storage and resources are cheap, until they are not.

    We are no longer concerned about compactness and efficient resource-management, which I find to be a pity.

    [1] We might be developing a webservice that'll be using this table in the future so we need to put in GUIDs now!
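
A rough back-of-envelope sketch of the column-size point above, written in Python purely for the arithmetic. It assumes SQL Server's fixed-width integer sizes (tinyint 1 byte, smallint 2, int 4, bigint 8) and ignores row or page compression, index copies and per-row overhead, so the result depends on which pair of types is compared and will not necessarily match the figure quoted in the post. The helper name `extra_bytes` is made up for illustration.

```python
# Fixed-width storage sizes in bytes for SQL Server integer types.
TYPE_BYTES = {"tinyint": 1, "smallint": 2, "int": 4, "bigint": 8}


def extra_bytes(rows: int, used: str, needed: str) -> int:
    """Bytes spent on one column by storing `used` where `needed` would do."""
    return rows * (TYPE_BYTES[used] - TYPE_BYTES[needed])


ROWS = 120_000_000  # row count taken from the post above

# Compare each wider type against tinyint for a single column.
for used in ("smallint", "int", "bigint"):
    wasted = extra_bytes(ROWS, used, "tinyint")
    print(f"{used:>8} instead of tinyint: {wasted / 1_000_000:,.0f} MB")
```

Every index that includes the column repeats the extra bytes, so the total for a heavily indexed table would be a multiple of the per-column number.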

  • Using the wrong sized data type means that it is harder to get it right. The perfect example is the use of a date & time type when only a date is required. Not only does it waste storage, it also causes potential issues, e.g. the following:

      1) Potentially breaking equality comparisons.

      2) Unnecessarily bringing in time zone issues.

      3) Tempting future abuse of unused time element.

      4) Losing clarity of solution.
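
Point 1 in the list above can be sketched as follows. This is a minimal illustration in Python rather than T-SQL, and the variable names and sample values are invented for the example; the same effect would be expected with any type that carries a time component when only the day matters.

```python
from datetime import date, datetime

# Two records for the same calendar day, stored with a time component.
stored_a = datetime(2015, 12, 9, 0, 0, 0)
stored_b = datetime(2015, 12, 9, 14, 30, 0)

# Equality on the full value fails even though the day is the same.
same_timestamp = stored_a == stored_b

# Comparing the date-only parts behaves as intended.
same_day = stored_a.date() == stored_b.date()

# A date-only value cannot carry a stray time in the first place.
only_date = date(2015, 12, 9)
matches_date = only_date == stored_b.date()

print(same_timestamp, same_day, matches_date)  # False True True
```

Every comparison against the wider type needs the explicit truncation step, which is exactly the kind of detail that gets forgotten.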

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • I would suggest that the growth in the volume of data stored does not reflect a growth in actual information.

    If you look, how much of that growth is a result of our Xerox mentality? We make endless copies of the same data, sharing it out, stashing copies here and there just in case, sometimes rearranging it for yet another effort at divining information from it. For example, even with the huge volume of sales transactions taking place every day, there is a finite amount of actual information generated by those transactions, and it is orders of magnitude smaller than the massive volume of data generated, passed around and archived by the stores, banks, credit-card vendors, fraud-detection services, Department of Treasury, and whatever other parties I haven't thought of.

    The other driving factor is our collective obsession with the idea that all data has value and should be preserved until we can find a way to derive that value. The harsh reality is that most of what we are preserving is as full of noise as surveillance video -- minutes (or mere seconds) of key information buried in thousands of hours of the camera watching people do ordinary things.

    Stop and imagine for a moment how long you could get people to sit and nod their heads to a straight-faced presentation about the value of a sensor-enhanced Roomba which would map how much lint and dust was pulled from each square inch of the room and relay that data to a cloud-based cleaning analysis service using highly optimized proprietary algorithms to generate an adaptive optimized route for cleaning your house. Then think about how much of our data is about as vital as ( room, x, y, dustballsize ).

  • Just bought the grandson an XBox One with 1 TB. How much space do you really need to store a game?

  • On the macro scale it is all just noise or pretty pictures. Rather like looking out into the cluster of galaxies at the center of the Milky Way.

    Since we are always looking at it on some micro level and can never see the actual "big picture", perhaps it's best to quit trying to comprehend the magnitude of it and just enjoy the night sky.

  • " However as humans, we will create multiple exabytes (EB) of data this year, an order of magnitude beyond the PB"

    Actually I thought an exabyte was 3 (decimal) orders of magnitude more....
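
A quick check of that arithmetic, assuming the decimal (SI) definitions of 10^15 bytes per petabyte and 10^18 bytes per exabyte. The function name is invented for the example.

```python
PETABYTE = 10**15  # bytes, decimal (SI) definition
EXABYTE = 10**18   # bytes, decimal (SI) definition


def decimal_orders_of_magnitude(larger: int, smaller: int) -> int:
    """Count how many times `smaller` must be multiplied by 10 to reach `larger`."""
    orders = 0
    value = smaller
    while value < larger:
        value *= 10
        orders += 1
    return orders


print(EXABYTE // PETABYTE)                             # 1000
print(decimal_orders_of_magnitude(EXABYTE, PETABYTE))  # 3
```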

  • Iwas Bornready (12/9/2015)


    Just bought the grandson an XBox One with 1 TB. How much space do you really need to store a game?

    Yeah really, I think my first computer only had 16 K of memory, how far we have come.

    -------------------------------------------------------------
    we travel not to escape life but for life not to escape us
    Don't fear failure, fear regret.

  • Steve,

    It has been a long time since I was in school, but I thought an Exabyte would be 3 orders of magnitude larger than a petabyte.

  • I am Brian and so is my wife!!!

    Perhaps it is three orders of magnitude, Steve?

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • william-700725 (12/9/2015)


    I would suggest that the growth in the volume of data stored does not reflect a growth in actual information.

    .....

    Agreed. There is little or no practical use for some of these huge swaths being created, it's just there because it's too much trouble to get rid of the junk.

    Even trying to draw some future information out of it is likely of little real use, because much of this data is drawn from an undocumented and unverified mix of other data sources. So it appears we have a lot of information, but what we have is largely garbage.

    The real danger is, however, using the garbage as if it represents actual information.

    ...

    -- FORTRAN manual for Xerox Computers --

  • below86 (12/9/2015)


    Iwas Bornready (12/9/2015)


    Just bought the grandson an XBox One with 1 TB. How much space do you really need to store a game?

    Yeah really, I think my first computer only had 16 K of memory, how far we have come.

    I started with 5K (ignoring the 16k RAM cartridge). Now with games being multiple GBs each, especially with two games per month with Xbox Gold, then it is easy to fill a hard drive.

    Star Wars Battlefront anyone? 😛 (tee hee)

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Because of the alphabet company my 5 year old daughter often asks me what a google plus some number is.

    412-977-3526 call/text

  • Sean Redmond (12/9/2015)


    A part of this increase in data-size is our sloppiness in database design. We throw in GUIDs everywhere partly because it is in fashion, partly because it makes development easier [1] and partly because we can. We use unicode by default because it saves us having to worry about codepages. We use datetime2(7) when a simple date would do. We use int and bigint when tinyint would do, e.g. for an enumeration table with the list of states in it. Sizewise it makes almost a negligible difference to the size of the enum table, but in the table with 120 million rows, the difference alone is almost 70MB, and that is before any indexes that use it are taken into consideration. But then storage and resources are cheap, until they are not.

    We are no longer concerned about compactness and efficient resource-management, which I find to be a pity.

    It's an art that's largely dying out. Most people would rather use a larger data type and not have to worry that some day they might need the extra storage, particularly for things like dates or ints. Compared to 40-50 years ago, storage space is rarely the bottleneck in a modern database, so what in the past would have been done by necessity is now largely not worth the time.
