• Eric M Russell (1/12/2015)


    I read an interesting book the other day (well, actually it was back in the early 1990s) by a guy named Kimball. His ideas sounded similar to this "data lake" thing, but he was calling it a "data warehouse".

    Going with the lake motif, there are natural lakes that accumulate organically over time, there are engineered lakes, and then there are lakes that pool up overnight when some beaver decides to dam up a stream and claim it as his own.

    That's what I'd seen it as, at least initially. However, the points about the cost of storage and the lack of transformation to get data into tabular form make some sense. There's a lot of engineering in RDBMSes that needs to happen for the data to be useful.

    I tend to agree. A data lake can be well put together, or it can be a cesspool.