Data Masker for SQL Server–Syncing Values Across Rows

Steve Jones, 2018-03-12 (first published: 2018-03-02)

I’ve been playing with Data Masker for SQL Server v6 and it’s an interesting product. I like the way it works, but I do find it a little challenging sometimes to figure out how to mask values. I’ve written a set of posts for different scenarios.

We had a customer ask about how to mask data across rows. The customer had a standalone table in which one column held values that matched across rows. They wanted those values changed, but the matching between rows kept.

In other words, here’s a small mocked set of the original data:

myid        Mychar     myint       mytinyint
----------- ---------- ----------- ---------
1           Steve      12345       1
1           Steve      12345       2
2           Andy       12345       3
2           Andy       12345       4
3           Brian      12345       5
3           Brian      12345       6
3           Brian      12345       7

Here are the results they want:

myid        Mychar     myint       mytinyint
----------- ---------- ----------- ---------
1           aaa        12345       1
1           aaa        12345       2
2           bbb        12345       3
2           bbb        12345       4
3           ccc        12345       5
3           ccc        12345       6
3           ccc        12345       7

I thought this was an interesting scenario. So how do we mask this? It’s not that hard, so let me show you.

First, let’s create a new masking set. I won’t walk through that here, but once you have a set connected to your database, here’s what we do.

To start, we need to substitute data out. In this case, I’ll substitute the name only. In the real world, we’d probably need to substitute the myid and myint columns as well, but I’ll leave those alone.

I add a new Substitution rule first.

This is a standard rule. I’ll add my column and pick a dataset. In this case, I’ll just pick male first names (Names, First, Male) and use that. This will result in a random set of names. I’ve chosen unique values. This is important because, across a large number of rows, I could otherwise end up with random values that match on rows with different myid values. That would be bad, since rows from different groups would then look like they belong together.
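To make the effect of the unique option concrete, here is a rough T-SQL sketch of what a unique substitution does to the demo table from the Setup Scripts section. This is not how Data Masker implements the rule. The @Names table variable and the seven names in it are hypothetical stand-ins for the product’s dataset, and I’m assuming there are at least as many names as rows.

```sql
-- Hypothetical stand-in for the "Names, First, Male" dataset.
-- The UNIQUE constraint guarantees no name can be handed out twice.
DECLARE @Names TABLE
( rn INT IDENTITY(1,1) PRIMARY KEY
, FirstName VARCHAR(10) NOT NULL UNIQUE
);

INSERT INTO @Names (FirstName)
VALUES ('Aaron'), ('Blake'), ('Caleb'), ('Derek'), ('Ethan'), ('Felix'), ('Grant');

-- Number the table rows in random order, then pair row n with name n.
-- Each row gets a different name, so no two myid groups can collide.
WITH shuffled AS
( SELECT mytinyint
       , ROW_NUMBER() OVER (ORDER BY NEWID()) AS rn
  FROM dbo.MyTestMask
)
UPDATE m
SET m.Mychar = n.FirstName
FROM dbo.MyTestMask AS m
JOIN shuffled AS s ON s.mytinyint = m.mytinyint
JOIN @Names AS n ON n.rn = s.rn;
```

After this runs, every row has a different name, which is exactly the “not quite what I need” state: the values are masked, but the matching within each myid group is gone until the rows are synced again.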

If I save and run this rule, I’ll see something like this.

Not quite what I need, but it’s a start. The important thing is that the first myid = 1 row is different from the first myid = 2 row, which is different from the first myid = 3 row.

Next we’ll add a Table Internal Sync rule. This is the rule that fixes values across rows inside a table. Here’s the basic config. Note that I choose a table and then I choose the columns that need syncing, in this case just the mychar column.

I need a way to determine which sets of rows should match. In this case, the myid column is used for that. If you examine the initial set, every row with a given name has the same myid value. This is what groups things together, so I’ll use this.
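If it helps to think about this in T-SQL terms, the grouping works roughly like the sketch below. This is only my illustration of the idea, not the statement Data Masker runs. In particular, taking the value from the row with the lowest mytinyint in each group is my assumption; I don’t know which row the product uses as the source.

```sql
-- For each myid group, find one "source" row (here: lowest mytinyint)
-- and copy its Mychar value to every row in the same group.
WITH src AS
( SELECT myid
       , Mychar
       , ROW_NUMBER() OVER (PARTITION BY myid ORDER BY mytinyint) AS rn
  FROM dbo.MyTestMask
)
UPDATE m
SET m.Mychar = s.Mychar
FROM dbo.MyTestMask AS m
JOIN src AS s
  ON s.myid = m.myid
 AND s.rn = 1;
```

The join on myid is the part that corresponds to choosing the grouping column in the rule, and the SET list corresponds to the columns chosen for syncing.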

One note: the red “I” to the right means this isn’t an indexed column. If I wanted better performance, I could add an index for this column, either permanently or just for the masking process.
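For the demo table, a temporary index for the masking run could look like this. The index name IX_MyTestMask_myid is just one I made up for the example.

```sql
-- Index the grouping column before running the masking set.
CREATE NONCLUSTERED INDEX IX_MyTestMask_myid
    ON dbo.MyTestMask (myid);

-- ... run the masking set here ...

-- Drop the index afterwards if it was only needed for masking.
DROP INDEX IX_MyTestMask_myid ON dbo.MyTestMask;
```

On a seven-row table this makes no difference, but on a large table the sync has to find all the rows for each myid value, and that is where an index should pay off.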

Now I execute this rule, and I see these results:

I have my groups back.
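On a table too big to eyeball, a couple of quick checks can confirm the same thing. These are my own sanity queries against the demo table, not something the product generates. Both should return no rows if the masking worked.

```sql
-- Any myid group that still has more than one name was not synced.
SELECT myid, COUNT(DISTINCT Mychar) AS distinct_names
FROM dbo.MyTestMask
GROUP BY myid
HAVING COUNT(DISTINCT Mychar) > 1;

-- Any name shared by more than one myid group is a collision
-- (the problem the unique option on the substitution rule avoids).
SELECT Mychar, COUNT(DISTINCT myid) AS groups_sharing_name
FROM dbo.MyTestMask
GROUP BY Mychar
HAVING COUNT(DISTINCT myid) > 1;
```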

I could expand this to include other columns as well, substituting the myint column in my first rule and including it in the second.

Setup Scripts

Here is the code to set this up:

-- Demo table: myid groups the rows whose Mychar values must keep matching
CREATE TABLE dbo.MyTestMask
( myid INT
, Mychar VARCHAR(10)
, myint INT
, mytinyint TINYINT PRIMARY KEY
);
GO
-- Seven rows in three myid groups
INSERT INTO dbo.MyTestMask
( myid
, Mychar
, myint
, mytinyint
)
VALUES
  ( 1, 'Steve', 12345, 1)
, ( 1, 'Steve', 12345, 2)
, ( 2, 'Andy', 12345, 3)
, ( 2, 'Andy', 12345, 4)
, ( 3, 'Brian', 12345, 5)
, ( 3, 'Brian', 12345, 6)
, ( 3, 'Brian', 12345, 7);
GO

I’d urge you to give Data Masker a try if you’re looking to ensure compliant, safe data sets for your non-production environments.
