January 23, 2023 at 12:00 am
Comments posted to this topic are about the item The Challenge of Deleting Data
Follow me on Twitter: http://www.twitter.com/way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
January 24, 2023 at 9:56 am
GDPR gives people the right to the following:-
These all create headaches with archive/backups. In the event of a restore you need to be able to replay the actions above up to the point of restore and possibly beyond. You need a means of logging the actions outside of the DB being restored.
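As a rough illustration of logging the actions outside the DB being restored, here is a minimal sketch in Python. It assumes erasure requests are appended to a JSON-lines file kept apart from the database and re-applied after a restore. The `customers` table, its `id` column, and the log file layout are hypothetical names chosen for the example, and SQLite stands in for whatever engine is actually in use.

```python
import json
import sqlite3
from datetime import datetime, timezone


def log_erasure(log_path, subject_id, requested_at=None):
    """Append one erasure request to a JSON-lines file kept outside the database."""
    requested_at = requested_at or datetime.now(timezone.utc)
    entry = {
        "action": "erase",
        "subject_id": subject_id,  # hypothetical key identifying the data subject
        "requested_at": requested_at.isoformat(),
    }
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")


def replay_erasures(conn, log_path):
    """Re-apply every logged erasure to a freshly restored database.

    Deleting a row that is already gone is harmless, so the whole log is
    replayed rather than only the entries newer than the backup.
    Returns the number of rows actually removed.
    """
    deleted = 0
    with open(log_path, encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue
            entry = json.loads(line)
            # "customers" and "id" are assumed names for this sketch.
            cur = conn.execute(
                "DELETE FROM customers WHERE id = ?", (entry["subject_id"],)
            )
            deleted += cur.rowcount
    conn.commit()
    return deleted
```

After a restore, calling `replay_erasures(conn, "erasures.jsonl")` would remove any subjects that the backup brought back. The log itself then holds identifiers, so it would need its own protection and retention rules.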
In AWS, S3 buckets are now encrypted by default, and we can apply AWS KMS encryption keys beyond that. The problem is storing the data in an easily recoverable format that is efficient to query, using a process that is itself efficient.
What I have seen is the manual effort required to enact GDPR processes being underestimated when those processes have been a low-priority, side-of-desk addition rather than a properly resourced design and implementation.
There is an SSIS component that allows data to be written out to Parquet format. I just wish that there were a native SQL Server command to do the same, or an option in BCP to do so. That would provide a good foundation for solving a lot of long-term archiving headaches.
January 24, 2023 at 7:04 pm
I think the biggest challenge, as you allude to, is the effort to track and ensure this stuff is handled well. I've been concerned about the restore/replay issues, and about ensuring that, everywhere we keep data, we either remove it or have a process to ensure removal over time.
Like EKM, we've built a lot of tech that isn't easy to manage when things aren't flowing as we expect.
January 25, 2023 at 9:08 pm
Really, going forward companies should minimize the amount of PII data that they collect about their customers. There comes a point of diminishing returns where the usefulness of data in terms of business development doesn't warrant the risk of keeping it around. I think that part of the problem is companies (typically startups in search of funding) that value their business in terms of digital assets when basically all they do is something more mundane like (for example) sell concert tickets or porn. Now they have this huge trove of sensitive data (probably not managed by the best and brightest DBAs) waiting to be exploited by hackers or venture capitalists.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
January 25, 2023 at 9:12 pm
It's not even mundane things. I see some orgs that sell to other businesses and give them trials or PoCs with live data. When those companies move on without being customers, or when they stop being customers, there's a lot of risk holding that data. Yet plenty of companies don't recognize or account for this.
Whether you work with consumers or businesses, every piece of PII is a liability. It might be an asset as well, but you should be sure you understand the net gain or loss.