Dynamic Data Masking and Adhoc queries.

  • This is the first time we are implementing this, Data Masking that became available in SQL Server 2016.

    I am not sure that this is a best Data Masking solution though.

    Because documentation says

    that a non-privileged user can run an ad-hoc query, using some functions like CAST, and data becomes visible.

    Is my understanding correct? DDM should only be implemented together with not allowing non-privileged users to execute ad hoc queries?

    and IF SO, then DDM is really not a good choice for data masking?

    Likes to play Chess

  • DDM is NOT a good choice for anything - for ad-hoc users its not like they can unmask it but they can infer the data and after multiple attempts they can determine the exact content of each value.

    DDM would only be useful if MS had implemented it in a way that was "masked" before any where clause applied to the contents of the columns - as that was not the case it it a worthless option.

    as an example (and ms docs have another example) I can do the following.

     

    select * from mytable where maskedcolumn like 'A%' -- this gives me all records starting with A on that column

    then I do

    select * from mytable where maskedcolumn like 'AB%' -- this gives me all records starting with AB on that column

    and so on.

    other methods can be used to determine the content.

  • I wouldn't use DDM as a "security feature" to save my soul because of the very problem that you state.  If you need to make it so someone can't actually see some data, then it needs to be properly encrypted.  Period.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It is with the purpose of letting an offshore team access a Dev server but they must not see PII.

    Likes to play Chess

  • WHAT!!!!!!

    Non Prod SHOULD NOT have PII information.

    Get your processes changed so dev gets a version of the prod database that has gone through a process of removing PII information from it.

  • So what mechanism of implementing data masking for PHI/PII would you suggest? if not using DDM.

    A third-party tool?  (Apex Data Mask?..)

    OR A few custom stored procs updating all sensitive columns data with a mask after live databases copies has been restored on target (masked) server?

    What we are looking to do:

    we have 20 columns in 15 tables  (in 3 different databases)  that need to be masked on the Dev server that will be exposed to offshore teams.

    Likes to play Chess

  • stored procs is the easiest and fastest to implement for a small scale like that.

    restore prod db to another instance (protected), run data desensitization process, backup and make backup available for other non prod instances to use.

    we do this with some high volume db's - and resulting dbs are used not only for non prod but also for BI where need for PII information is negligible (most that could be used on our case would be age for small number of reports and residential area (and this one can be masked and replaced with just a internal code used by BI))

    all other details (name, email, full address, policy information, medical details are either removed or replaced with random data)

  • Is there any existing example somewhere? I could not find on the Internet. Like a generic sp_obfuscate_varchar_column, something like that? T-SQL based.

    Likes to play Chess

  • Use ENCRYPTBYASYMKEY and DECRYPTBYASYMKEY.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • I think that even you are capable of doing a "update t set firstname = 'xxxx' from mytbl" without the need to give you examples from the net.

     

  • We set "Attributes" on a column (in our own Meta Data) for things like this.

    I would hate to intrust it to a bunch of updates, in a script disassociated from the Table / Column Design/Maintenance processes as, for us, we consider things like that as part of Table/Column design and subsequent Table/Column review

    Having set the appropriate attribute then dynamic SQL takes care of it, so we don't have to worry about whether a column is accidentally not included in an UPDATE script and so on. Also, if we find that we forgot to include a column just "setting the Column Attribute" will mean that tonight's warehouse-copy will obfuscate that column. (Our actual process requires some oversight otherwise someone could just "untick" them all 🙂 )

  • First, on Dynamic Data Masking: https://www.sqlservercentral.com/steps/permissions-and-security-in-dynamic-data-masking (see the data leaking section).

    There are tools, like SQL Data Masker (https://www.red-gate.com/products/dba/data-masker/), that do this. Ultimately, there isn't anything built in. Scripts will often need to be custom, but as noted above, they can be simple if you just need something changed. update pii_column = 'xxxx' works fine. If you need more realistic data, that's not so simple.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply