Hi everyone,
I’ve been looking into using AI to help with data-related tasks (writing queries, transforming data, explaining datasets, etc.), and one thing that’s been on my mind is how to deal with sensitive data.
In a lot of real-world cases, the data isn’t something you can just paste into a tool: customer info, internal records, or anything else confidential.
So I’m wondering how people here are approaching this:
Do you anonymize or mask your data before using it in prompts?
Or do you just recreate a simplified/sample version of the dataset instead?
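To make the two options concrete, here’s a rough Python sketch of what I have in mind. Everything in it is made up for illustration: the column names, the `SALT` value, the example row, and the helper functions `mask_value`, `mask_rows`, and `synthetic_rows` are all hypothetical. Option 1 swaps sensitive values for salted-hash pseudonyms, and option 2 generates fake rows that only share the schema.

```python
import hashlib
import random

# Hypothetical secret salt; in practice it would live outside source control.
SALT = "replace-me"

def mask_value(value: str, prefix: str) -> str:
    """Replace a sensitive value with a stable pseudonym (salted SHA-256, truncated)."""
    digest = hashlib.sha256((SALT + value).encode("utf-8")).hexdigest()[:8]
    return f"{prefix}_{digest}"

def mask_rows(rows, sensitive):
    """Option 1: keep the real rows but pseudonymize the sensitive columns."""
    return [
        {k: mask_value(str(v), k) if k in sensitive else v for k, v in row.items()}
        for row in rows
    ]

def synthetic_rows(n, seed=0):
    """Option 2: invent rows that only share the schema with the real table."""
    rng = random.Random(seed)
    return [
        {
            "name": f"customer_{i}",
            "email": f"user{i}@example.com",
            "order_total": round(rng.uniform(5, 500), 2),
        }
        for i in range(n)
    ]

# Made-up example row standing in for real data.
real = [{"name": "Jane Doe", "email": "jane@corp.example", "order_total": 129.5}]

masked = mask_rows(real, sensitive={"name", "email"})  # real values, hidden identities
sample = synthetic_rows(3)                             # nothing real at all
```

One caveat I’m aware of: salted hashing like this is pseudonymization rather than true anonymization, and non-sensitive columns left untouched (like `order_total` here) can still leak information in combination.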
I feel like this is one of those areas that doesn’t get talked about enough, especially when working with real production data.
Curious to hear what’s actually working for you in practice.