• James Goodwin (8/13/2010)


    Since you brought it up, please explain the semantic difference between a random set and an arbitrary set of numbers.

    Jeff,

    An arbitrary number is one where you don't care how it was generated. As in the example in my previous post, since you just need an arbitrary number, you can use Rank() to generate the numbers. They're not random, but they solve the de-identification issue presented.

    A random number, on the other hand, is a number generated through a random (or pseudo-random) process. With a uniform generator, every number in the range has an equal chance of coming up on every draw, regardless of what was drawn before, so a set of random numbers is not guaranteed to contain unique numbers. The benefits of random numbers are their distribution and their unpredictability. If you don't need those, then asking for randomness just makes your life more difficult. Putting constraints (such as uniqueness) on a set of random numbers makes the set less random.
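To make the contrast concrete, here is a minimal sketch in Python rather than SQL. The helper names `arbitrary_ids` and `random_ids` are hypothetical, and the dense-rank numbering only stands in for the Rank() approach mentioned above; it is not the query from the earlier post. It numbers distinct values by rank, then draws independent uniform integers from a range too small to avoid repeats.

```python
import random


def arbitrary_ids(values):
    """Number the distinct values 1, 2, 3, ... in sorted order (a dense rank)."""
    return {v: i for i, v in enumerate(sorted(set(values)), start=1)}


def random_ids(count, low, high, seed=None):
    """Draw `count` independent uniform integers from the closed range [low, high]."""
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(count)]


# Made-up sample data: four rows, three distinct names.
names = ["Carol", "Alice", "Bob", "Alice"]
ranked = arbitrary_ids(names)

# 20 draws from only 10 possible values must contain repeats.
draws = random_ids(20, 1, 10, seed=1)

print(ranked)
print(len(draws), "draws,", len(set(draws)), "distinct")
```

The rank-based numbers are unique by construction and entirely predictable, which is fine when all you need is a placeholder. The random draws are unpredictable but carry no uniqueness guarantee.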

    To see why asking another person for a random number is bad, see http://scienceblogs.com/cognitivedaily/2007/02/is_17_the_most_random_number.php

    --

    JimFive

    Thanks, James. As Magarity Kerns just posted, that ties it all together. I think a lot of people spend a lot of time making "random" numbers in an attempt to "obfuscate" data when all they really need are "arbitrary" placeholders. It makes life even simpler when folks are made to understand that a list of test SSNs can actually be just a bunch of sequential nine-digit numbers.

    Anyway, thank you and Magarity for putting a very different and still very practical spin on this whole subject.
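As a rough illustration of the sequential nine-digit placeholders mentioned above, here is a small Python sketch. The function name `placeholder_ssns`, the starting value 100000000, and the sample input values are all assumptions made up for the example, not anything from this thread.

```python
def placeholder_ssns(real_ssns, start=100000000):
    """Map each distinct input value to the next sequential nine-digit string.

    `start` is an arbitrary assumption; any value that leaves enough room
    below 1,000,000,000 for the number of distinct inputs would do.
    """
    mapping = {}
    for ssn in real_ssns:
        if ssn not in mapping:
            # The next unused number, zero-padded to nine digits.
            mapping[ssn] = f"{start + len(mapping):09d}"
    return mapping


# Made-up sample values; the repeated one gets the same placeholder both times.
sample = ["123-45-6789", "987-65-4321", "123-45-6789"]
mapping = placeholder_ssns(sample)
print(mapping)
```

Because the same input always maps to the same placeholder, joins between de-identified tables still line up, and no randomness is involved anywhere.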

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.

