|
|
|
SSCommitted
      
Group: General Forum Members
Last Login: Tuesday, January 15, 2013 11:11 AM
Points: 1,945,
Visits: 2,782
|
|
This is called an additive congruency generator, as oppsd to a random number generator. Here is the formula for a 31-bit additive congruency generator.
UPDATE generator SET keyval = keyval/2 + MOD(MOD(keyval, 2) + MOD(keyval/8, 2), 2) * 2^30; Here is the same algorithm implemented in C.
int asequence() {static int n = 1; n = n>>1 | (( n^n>>3 ) & 1) << 30; return n;}
There are other formulas for different length integers.
Books in Celko Series for Morgan-Kaufmann Publishing Analytics and OLAP in SQL Data and Databases: Concepts in Practice Data, Measurements and Standards in SQL SQL for Smarties SQL Programming Style SQL Puzzles and Answers Thinking in Sets Trees and Hierarchies in SQL
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Yesterday @ 2:52 PM
Points: 1,322,
Visits: 1,071
|
|
wppatton, I'm not going to give you the answer because that looks way too much like a homework problem. I will give you a fairly broad hint however--The mod operator returns the remainder of an integer division. -- JimFive
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Today @ 9:29 AM
Points: 2,551,
Visits: 18,884
|
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Yesterday @ 2:52 PM
Points: 1,322,
Visits: 1,071
|
|
As AndyC mentions this is not a random set.
However, for this case you don't need a random set, you need an arbitrary set of numbers. (More accurately, you need an arbitrary set that is easy to generate and impossible to reverse.) One way to achieve this would be to use RANK over the TaxID and then pad them out to the proper length. For nonnumeric data I would look at using CHECKSUM. -- JimFive
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Friday, August 13, 2010 1:57 AM
Points: 4,
Visits: 12
|
|
James Goodwin (8/12/2010) As AndyC mentions this is not a random set.
However, for this case you don't need a random set, you need an arbitrary set of numbers. (More accurately, you need an arbitrary set that is easy to generate and impossible to reverse.) One way to achieve this would be to use RANK over the TaxID and then pad them out to the proper length. For nonnumeric data I would look at using CHECKSUM. -- JimFive
Good point Jim, I actually had the opposite problem when generating person name test data in that everyone had a unique surnames where more typical would be that several people have the same name e.g. Smith and some surnames have just one person. Same principle goes for dates of birth
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Monday, August 23, 2010 9:20 AM
Points: 3,
Visits: 18
|
|
Jim,
It's not even close to being a homework problem. I wouldn't post homework for someone else to do. It's real world problem being brought to me by a business associate.
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Friday, August 13, 2010 1:57 AM
Points: 4,
Visits: 12
|
|
Sounds a bit like a Sudoku puzzle, there might be similar techniques you could use
http://www.vsj.co.uk/articles/display.asp?id=540
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Today @ 12:22 PM
Points: 6,660,
Visits: 5,684
|
|
wppatton (8/12/2010) Jim,
It's not even close to being a homework problem. I wouldn't post homework for someone else to do. It's real world problem being brought to me by a business associate.
Post the table structure and data samples in the T-SQL forums and then PM me the link. I'll take a look at it and I'm sure a lot of other people will too.
Brandie Tarvin, MCITP Database Administrator, MCDBA, MCSA
Webpage: http://www.BrandieTarvin.net LiveJournal Blog: http://brandietarvin.livejournal.com/ On LinkedIn!, Google+, and Twitter.
Freelance Writer: Shadowrun Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Today @ 10:50 AM
Points: 1,164,
Visits: 3,336
|
|
A little off topic maybe, but below is a possible technique to hash demographic or personally identifying data for a QA or Development environment. The distribution of the data remains basically the same as the original. Please note that this is something I hacked together in a few minutes, and I've never actually used this professionally. Also, the performance would suck unless customer name and birth_date are both indexed.
declare @customer table ( customer_id int not null, first_name varchar(40) not null, last_name varchar(40) not null, birth_date smalldatetime not null );
insert into @customer (customer_id, first_name, last_name, birth_date) select 1, 'Beverly','Johnson','1970/04/01' union all select 2, 'Mark','Johnson','1972/03/10' union all select 3, 'Mark','Johnson','1972/10/03' union all select 4, 'Scott','Lemon','1982/01/04' union all select 5, 'Michelle','Snow','1958/10/24' union all select 6, 'Scott','Richards','1958/10/24';
select customer_id, left(first_name,1)+cast( (select count(*) from @customer b where b.first_name > a.first_name) as varchar(9) ) first_name, left(last_name,1)+cast( (select count(*) from @customer b where b.last_name > a.last_name) as varchar(9) ) last_name, dateadd( day, (select count(*) from @customer b where b.birth_date > a.birth_date), birth_date ) birth_date from @customer a order by customer_id;
customer_id first_name last_name birth_date ----------- ---------- ---------- ----------------------- 1 B5 J3 1970-04-04 00:00:00 2 M3 J3 1972-03-12 00:00:00 3 M3 J3 1972-10-04 00:00:00 4 S0 L2 1982-01-04 00:00:00 5 M2 S0 1958-10-28 00:00:00 6 S0 R1 1958-10-28 00:00:00
"Wise people understand the 10,000 things without going to each one. They know them without having to look at each one, and they transform all without acting on each one." - The Tao Te Ching: Verse 47
|
|
|
|
|
Old Hand
      
Group: General Forum Members
Last Login: Monday, May 13, 2013 10:22 PM
Points: 358,
Visits: 393
|
|
I understand random numbers for some cases but it seems strange here. If the purpose is to encrypt real EINs then one of the built in encryption routines would be a lot easier. If the purpose is to just a placeholder then why not just a simple integer counter; how is 1,2,3 less secure by obfuscation than random numbers?
PS - The guy with the table arrangement problem sounded at first like a contrived homework problem but then I thought he's probably trying to set up a dating websites and needs a process to prearrange the seating at a group meet&greet so don't go too harsh on him.
|
|
|
|