Randomizing the Day Inside a Month Using T-SQL

  • Comments posted to this topic are about the item Randomizing the Day Inside a Month Using T-SQL

  • This is helpful.

    However, I see a risk with approach-1 because for customer id 97, Random date is  same as the original one. Ideally, new date should never be equal what we have in production system. What do you think?


  • why should this a problem? Random is random and each rule you add reduces the "complexity". If I know, that it must not be the original birth date, there are 30 possible days in January, without this rule it could be within 31 days.

    Same is true for passwords, if you allow only latin A-Z upper and lowercase, numbers and maybe 10 chars, the total amount of passwords is much lower than if you would allow every unicode char. And if someone wants to start his password with a 💩-smiley it is  fine and much more unlikely to find in some password databases than password starting with a simple A .

    God is real, unless declared integer.

  • @Emmit:

    why do you not use

    DATEFROMPARTS(YEAR(cd.BirthDate), MONTH(cd.BirthDate), ABS(CHECKSUM(NEWID())) % (DAY(EOMONTH(cd.BirthDate))) + 1) AS new_birth_date

    It would return random dates from month first to the month's last day.

    God is real, unless declared integer.

  • You are correct.

    But when we say data is scrambled, does it mean that scrambled data should not match with data before scrambling? Shall we allow same values knowing these is 1 percent probability.


  • Stuff like this worries me.  Just obfuscating the day part of a birthday still allows someone to determine that rough age of a person and, if they have a couple of possible dates in different months, your data still allows them to confirm one of their dates.

    The big thing here is are you leaving the original names or addresses of the people whose birthdays you think you're obfuscating?  If the answer is "yes", then you really haven't accomplished any PII goals.

    If the answer is "No", then I feel much better about your workaround for the birth dates.

    --Jeff Moden

    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply