RE: Should the Data Lake be Immutable?

Mr or Mrs. 500

Points: 561

February 26, 2019 at 8:03 am

#2022974

I subscribe to the idea of using zones in a data lake, and having a "raw zone" that is immutable. The data in other zones should be replaceable by rerunning a repeatable process that "selects" data form the immutable raw zone and places/replaces it in the downstream zone. Melissa Coates has a great diagram showing a data lake with different zones for a visual: https://www.sqlchick.com/entries/2017/12/30/zones-in-a-data-lake