The Challenge of Deleting Data

  • Comments posted to this topic are about the item The Challenge of Deleting Data

  • GDPR gives people the following rights:

    • SAR (Subject Access Request): to see what data you hold on them.
    • Rectification: to insist on corrections when your data on them is incorrect.
    • RTBF (Right To Be Forgotten): to have their data erased.

    These all create headaches with archives and backups.  After a restore, you need to be able to replay the corrections and erasures made since the backup was taken, up to the point of restore and possibly beyond.  That means logging those actions somewhere outside the DB being restored.
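
    A minimal sketch of such an external log, assuming a JSON-lines file and a hypothetical `customer` table with `id` and `email` columns (SQLite stands in for the restored database, and the function names are illustrative only). After a restore, every action logged later than the backup timestamp is re-applied:

```python
import json
import sqlite3
from datetime import datetime, timezone


def log_action(log_path, action, subject_id, payload=None, when=None):
    """Append one GDPR action to a JSON-lines file kept outside the database."""
    entry = {
        "at": (when or datetime.now(timezone.utc)).isoformat(),
        "action": action,            # "rectify" or "erase" (assumed vocabulary)
        "subject_id": subject_id,
        "payload": payload or {},    # corrected values for a rectification
    }
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")


def replay_actions(conn, log_path, backup_taken_at):
    """Re-apply every logged action recorded after the backup was taken."""
    applied = 0
    with open(log_path, encoding="utf-8") as fh:
        for line in fh:
            entry = json.loads(line)
            # Actions at or before the backup time are already in the backup.
            if datetime.fromisoformat(entry["at"]) <= backup_taken_at:
                continue
            if entry["action"] == "erase":
                conn.execute(
                    "DELETE FROM customer WHERE id = ?",
                    (entry["subject_id"],),
                )
            elif entry["action"] == "rectify":
                conn.execute(
                    "UPDATE customer SET email = ? WHERE id = ?",
                    (entry["payload"]["email"], entry["subject_id"]),
                )
            applied += 1
    conn.commit()
    return applied
```

    Note that `backup_taken_at` must be a timezone-aware datetime, and that a real log would need care of its own, since rectification entries contain personal data too.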

    In AWS, S3 buckets are now encrypted by default, and we can apply AWS KMS encryption keys on top of that.  The problem is storing the data in an easily recoverable format that is efficient to query, using a process that is itself efficient to run.

    What I have seen is the manual effort required to enact GDPR processes being underestimated when those processes have been a low-priority, side-of-desk addition rather than a properly resourced design and implementation.

    There is an SSIS component that allows data to be written out in Parquet format.  I just wish there were a native SQL Server command to do the same, or an option in BCP to do so.  That would provide a good foundation for solving a lot of long-term archiving headaches.

  • I think the biggest challenge, as you allude to, is the effort to track and ensure this stuff is handled well. I've been concerned about the restore/replay issues, and about ensuring that everywhere we keep data, we either remove it or have a process that guarantees removal over time.

    Like EKM, we've built a lot of tech that isn't easy to manage when things aren't flowing as we expect.

  • Really, going forward, companies should minimize the amount of PII that they collect about their customers. There comes a point of diminishing returns where the usefulness of data in terms of business development doesn't warrant the risk of keeping it around.  I think that part of the problem is companies (typically startups in search of funding) who value their business in terms of digital assets when basically all they do is something more mundane like (for example) sell concert tickets or porn. Now they have this huge trove of sensitive data (probably not managed by the best and brightest DBAs) waiting to be exploited by hackers or venture capitalists.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • It's not even mundane things. I see some orgs that sell to other businesses and give them trials or PoCs with live data. When those companies move on without being customers, or when they stop being customers, there's a lot of risk holding that data. Yet plenty of companies don't recognize or account for this.

    Whether you work with consumers or businesses, every piece of PII is a liability. It might be an asset as well, but you should be sure you understand the net gain or loss.
