I’ve been playing with Data Masker for SQL Server v6 and it’s an interesting product. I like the way it works, but I do find it a little challenging sometimes to figure out how to mask values. I’ve written a set of posts for different scenarios.
In a previous post I showed how I could change column values across rows in a table. This is useful in that it moves my data from this:
to this:
That’s good, but since I haven’t changed the myid column, I really haven’t masked things well. In fact, it would be trivial to derive the initial mychar values from the set if I knew the original values.
Let’s fix that.
My masking set looks like this for now:
I want to add a new substitution rule that will mix up my myid values. I add the rule and configure the correct column. In this case I limit the boundaries of my random numbers to 1-100, but I could choose any value.
I save this and run just this rule. Now my data looks like this:
We’re partway there, but I don’t have grouping. Elric has two different values for myid, which is wrong. Let’s fix that. Just as in the previous post, we’ll now add a Table Internal Sync rule. We configure this in reverse of the previous article. Now we use the new, changed names in mychar as the group column and myid as the column to sync across rows.
If I execute this, then I get the following:
Tada.
Now I just need to ensure these rules run in order (all four) with dependencies and when I run the entire set, I’ll get my table synced. Here’s the final rule set. Notice that I’ve created dependencies.
I’ll save the set, and then re-run it. Now I get these results:
Data masking is something that’s never been this easy for me. I’ve built lots of large scripts that made perfect sense for me, but were difficult to turn over to anyone else for usage and maintenance.
I’d urge you to give Data Masker a try if you’re looking to ensure compliant, safe data sets for your non-production environments.