• roger.plowman - Tuesday, February 26, 2019 7:24 AM

    The first question that should be asked is, should you even have a data lake or data warehouse?

    Harking back to the whole security issue, a data lake is precisely the kind of holy grail hackers would be salivating over. Since you're dumping (mostly) raw data into it, what are the chances that it contains PII? Or even sensitive information that could embarrass or seriously threaten your company?

    Second, if you make the data immutable, how do you update data that's erroneous? Or delete data in accordance with GDPR or some as-yet-unwritten law?

    I suspect the immutability question should come only after asking whether you should have the data lake or warehouse in the first place.

    In a lot of cases, yes, there is real value for a company in being able to see historically what changes have been made to data over time. And, as was mentioned above, from an auditing perspective it might in fact be required to store the historical changes to PII.