|
|
|
Old Hand
      
Group: General Forum Members
Last Login: Monday, May 13, 2013 10:22 PM
Points: 358,
Visits: 393
|
|
The benefits of random numbers are distribution and unpredictability. If you don't need those, then asking for randomness just makes your life more difficult. I think this is the same basic point I was asking about; if this article is about a theoretical exercise it is worded poorly. I can't figure out why anyone would jump through these hoops in the given situation except maybe a consultant paid by the hour.
Stores in franchise #1, obfuscated EIN = 1. Stores in franchise 2, obfuscated EIN = 2. What am I missing??
PS - I wish some people would have pity on my underpowered netbook and its tiny screen trying to render these pages and go easy on the animated gifs and 10 line sigs attached to one word comments
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Today @ 4:45 AM
Points: 6,657,
Visits: 5,680
|
|
It's not a theoretical exercise. I actually had to do this for a project. And setting up encryption was not an option given me.
EDIT: Since I had to jump through so many hoops and I didn't find a lot of articles on the subject, I thought I'd post for anyone else who is looking for something like this.
Brandie Tarvin, MCITP Database Administrator, MCDBA, MCSA
Webpage: http://www.BrandieTarvin.net LiveJournal Blog: http://brandietarvin.livejournal.com/ On LinkedIn!, Google+, and Twitter.
Freelance Writer: Shadowrun Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Yesterday @ 9:57 PM
Points: 32,906,
Visits: 26,790
|
|
James Goodwin (8/13/2010)
Since you brought it up, please explain the semantical differences between a random set and an arbitrary set of numbers. Jeff, An arbitrary number is a number that you don't care how it is generated. As in the example in my previous post, since you just need an arbitrary number you can use Rank() to generate the numbers. They're not random, but they solve the deidentification issue presented. A random number, on the other hand, is a number generated through a random (or pseudo-random) process. This means that any number has an equal chance of coming up on any given test and therefore a set of random numbers is not guaranteed to contain unique numbers. The benefits of random numbers are distribution and unpredictability. If you don't need those, then asking for randomness just makes your life more difficult. Putting constraints (such as uniqueness) on a set of random numbers makes the set less random. To see why asking another person for a random number is bad, see http://scienceblogs.com/cognitivedaily/2007/02/is_17_the_most_random_number.php-- JimFive
Thanks, James. As Magarity Kerns just posted, that ties it all together. I think a lot of people spend a lot of time making "random" numbers in an attempt to "obfuscate" data and all they really need are "arbitrary" place holders. It makes life even simpler when folks are made to understand that a list of test SSN's can actually be just a bunch of sequential 9 digit numbers.
Anyway, thank you and Magarity for putting a very different and still very practical spin on all of this subject.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Yesterday @ 9:57 PM
Points: 32,906,
Visits: 26,790
|
|
Bert De Haes (8/13/2010)
You can easely resolve the "unique" problem by adding a distinct to the select top(). I tried this, and it works fine on SQL 2005. The distinct will not generate an ordered list ... ( you can use newid() or rand() ) CREATE TABLE dbo.#RandomTaxID (TaxID int); GO INSERT INTO dbo.#RandomTaxID (TaxID) SELECT distinct TOP (70000) RIGHT(CAST(CAST(CAST(NEWID() AS VARBINARY) AS BIGINT) AS VARCHAR(30)),9) FROM dbo.Tally; ALTER TABLE dbo.#RandomTaxID ADD MyID int Identity(1,1) NOT NULL primary key clustered; go -- verify uniqueness : create unique index UQ_RandomTaxID on #RandomTaxID(TaxID);
select top 10 * from #RandomTaxID order by MyID;
TaxID MyID 514443407 1 959020860 2 287024448 3 269261913 4 575137863 5 423204402 6 566434316 7 66537729 8 82584852 9 752384001 10
Works just fine... until you need more than what your Tally table may have in it. Try it to gen a million numbers using a single cross join on an 11000 row Tally table. On second thought, don't try it... I stopped the code after 2.5 hours just trying to gen 70K rows.
Using a Million row Tally, it generated 999541 rows after the distinct in about 27 seconds. Deduping appears to cost a lot on the fly both in your code and mine.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Yesterday @ 9:57 PM
Points: 32,906,
Visits: 26,790
|
|
Brandie Tarvin (8/13/2010)
Jeff Moden (8/13/2010)
You can see that an average of about 520 numbers out of the million duplicate on each run. The run above (not including the dupe check) takes about 2 seconds to execute. The following run consitently gens a million UNIQUE random numbers in random order in about 26 seconds (again, not including the verification times).
Oh. Now I get to play with more of your code. Thanks, Jeff. @=)
Having code to play with is always fun. Nice article, Brandie. It brought out several good discussions including the discussion that you may not need "random" numbers at all... just "arbitrary" numbers.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Sunday, August 22, 2010 2:08 PM
Points: 10,
Visits: 79
|
|
Support the animated giff comment - they get do very old very quickly.
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: 2 days ago @ 12:09 PM
Points: 2,550,
Visits: 18,882
|
|
pete callaghan (8/22/2010) Support the animated giff comment - they get do very old very quickly. For those of you who are bothered, go to your Control Panel (top left of the page), Forum Settings, under Topic Display Options, uncheck the Show Avatars checkbox. Save your settings, and you're good to go.
Or, upgrade your computer.
--------------------------------------------------------- How best to post your question How to post performance problems Tally Table:What it is and how it replaces a loop
"stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Today @ 4:45 AM
Points: 6,657,
Visits: 5,680
|
|
I had no idea what that comment was supposed to mean. Since you replied, maybe you can explain it to me? @=)
Brandie Tarvin, MCITP Database Administrator, MCDBA, MCSA
Webpage: http://www.BrandieTarvin.net LiveJournal Blog: http://brandietarvin.livejournal.com/ On LinkedIn!, Google+, and Twitter.
Freelance Writer: Shadowrun Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: 2 days ago @ 12:09 PM
Points: 2,550,
Visits: 18,882
|
|
Brandie Tarvin (8/23/2010) I had no idea what that comment was supposed to mean. Since you replied, maybe you can explain it to me? @=)Related to the PS on magarity's post PS - I wish some people would have pity on my underpowered netbook and its tiny screen trying to render these pages and go easy on the animated gifs and 10 line sigs attached to one word comments Actually, there's a Show Signatures checkbox there too, so if you wanted to disable both for your view, you could do so.
--------------------------------------------------------- How best to post your question How to post performance problems Tally Table:What it is and how it replaces a loop
"stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Today @ 4:45 AM
Points: 6,657,
Visits: 5,680
|
|
jcrawf02 (8/23/2010)
Brandie Tarvin (8/23/2010) I had no idea what that comment was supposed to mean. Since you replied, maybe you can explain it to me? @=)Related to the PS on magarity's post
Ah. Totally missed that. Thanks for the clarification.
And ditto on the editing the views to meet your personal preferences. That way you can have everything just the way you like it, or just the way your PC/Laptop/Netbook can handle it.
Brandie Tarvin, MCITP Database Administrator, MCDBA, MCSA
Webpage: http://www.BrandieTarvin.net LiveJournal Blog: http://brandietarvin.livejournal.com/ On LinkedIn!, Google+, and Twitter.
Freelance Writer: Shadowrun Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.
|
|
|
|