Finding Unique Non-Repeating Random Numbers
Posted Thursday, August 12, 2010 6:22 AM


SSCommitted


Group: General Forum Members
Last Login: Yesterday @ 6:20 PM
Points: 1,945, Visits: 3,067
This is called an additive congruential generator, as opposed to a random number generator. Here is the formula for a 31-bit additive congruential generator.

-- keyval must start non-zero; in T-SQL, write MOD(a, b) as a % b
UPDATE generator
SET keyval = keyval/2 + MOD(MOD(keyval, 2) + MOD(keyval/8, 2), 2) * POWER(2, 30);

Here is the same algorithm implemented in C.

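int asequence(void)
{
    static int n = 1;  /* seed; must be non-zero */
    /* shift right one bit; feed (bit 0 XOR bit 3) back in as bit 30 */
    n = (n >> 1) | (((n ^ (n >> 3)) & 1) << 30);
    return n;
}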

There are other formulas for different length integers.
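
To make the non-repeating property concrete, here is a minimal C sketch that generalizes the step above to any register width. The names lfsr_step and lfsr_period, and the 4-bit tap choice in the note below, are illustrative assumptions and not part of the original post. Shift-and-XOR generators of this shape are commonly described as linear feedback shift registers.

```c
#include <stdint.h>

/* One step of the generator: shift right by one bit and feed
   (bit 0 XOR bit `tap`) back in at the top bit (position nbits - 1). */
static uint32_t lfsr_step(uint32_t state, unsigned nbits, unsigned tap)
{
    uint32_t feedback = (state ^ (state >> tap)) & 1u;
    return (state >> 1) | (feedback << (nbits - 1));
}

/* Count the steps until the state returns to `seed` (the cycle length).
   No value repeats within one cycle. */
static uint32_t lfsr_period(uint32_t seed, unsigned nbits, unsigned tap)
{
    uint32_t state = seed;
    uint32_t steps = 0;
    do {
        state = lfsr_step(state, nbits, tap);
        steps++;
    } while (state != seed);
    return steps;
}
```

With nbits = 31 and tap = 3, lfsr_step matches the formula above. With nbits = 4 and tap = 1, lfsr_period(1, 4, 1) returns 15: starting from 1, the register visits every value from 1 to 15 exactly once before it repeats. A seed of 0 never leaves 0, which is why the seed has to be non-zero.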


Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #968107
Posted Thursday, August 12, 2010 6:23 AM
Ten Centuries


Group: General Forum Members
Last Login: Monday, January 27, 2014 10:14 AM
Points: 1,322, Visits: 1,091
wppatton,
I'm not going to give you the answer because that looks way too much like a homework problem. I will give you a fairly broad hint, however: the mod operator returns the remainder of an integer division.
--
JimFive
Post #968109
Posted Thursday, August 12, 2010 6:24 AM


Ten Centuries


Group: General Forum Members
Last Login: 2 days ago @ 11:55 AM
Points: 1,330, Visits: 19,306
Great article, Brandie. I like how you offered multiple solutions and showed how you worked through them.

Jeff likes to use ABS(CHECKSUM(NEWID())) to generate random numbers. I haven't done any testing on it like you did, so I don't know whether it has issues with repetition. I've been treating it as if it doesn't for non-critical use; I suppose I should check that.

See http://www.sqlservercentral.com/Forums/Topic666309-338-1.aspx for an example of his use.
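
For anyone who wants to run that check, one simple approach is to sort a batch of generated values and count the neighbours that are equal. The sketch below does this in C. It says nothing about how ABS(CHECKSUM(NEWID())) itself behaves, and the names compare_ints and count_repeats are made up for illustration.

```c
#include <stdlib.h>
#include <stddef.h>

/* qsort comparator for ints that avoids overflow from subtraction. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sort the sample in place, then count entries equal to their predecessor.
   Returns 0 when every value in the sample is unique. */
static size_t count_repeats(int *values, size_t n)
{
    size_t repeats = 0;
    qsort(values, n, sizeof values[0], compare_ints);
    for (size_t i = 1; i < n; i++) {
        if (values[i] == values[i - 1]) {
            repeats++;
        }
    }
    return repeats;
}
```

A result of 0 only shows that this particular batch had no repeats; it does not prove the generator can never repeat.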


---------------------------------------------------------
How best to post your question
How to post performance problems
Tally Table:What it is and how it replaces a loop

"stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
Post #968110
Posted Thursday, August 12, 2010 6:33 AM
Ten Centuries


Group: General Forum Members
Last Login: Monday, January 27, 2014 10:14 AM
Points: 1,322, Visits: 1,091
As AndyC mentions, this is not a random set.

However, for this case you don't need a random set, you need an arbitrary set of numbers. (More accurately, you need an arbitrary set that is easy to generate and impossible to reverse.) One way to achieve this would be to use RANK over the TaxID and then pad them out to the proper length. For nonnumeric data I would look at using CHECKSUM.
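
The RANK-and-pad idea can be sketched in C roughly as follows. This is only an illustration under a few assumptions: rank_of and surrogate_id are hypothetical names, the IDs are assumed to fit in a long, and the pad width is left to the caller because the post does not state a length.

```c
#include <stdio.h>
#include <stddef.h>

/* SQL-style RANK(): 1 + the number of values strictly smaller than ids[i].
   Equal IDs therefore receive the same rank. */
static int rank_of(const long *ids, size_t n, size_t i)
{
    int rank = 1;
    for (size_t j = 0; j < n; j++) {
        if (ids[j] < ids[i]) {
            rank++;
        }
    }
    return rank;
}

/* Write the rank of ids[i] into out, zero-padded to `width` digits. */
static void surrogate_id(const long *ids, size_t n, size_t i,
                         int width, char *out, size_t out_size)
{
    snprintf(out, out_size, "%0*d", width, rank_of(ids, n, i));
}
```

For the IDs {500, 100, 300} and a width of 9, the surrogates come out as 000000003, 000000001 and 000000002. The surrogate keeps the sort order of the originals but none of their digits.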
--
JimFive
Post #968115
Posted Thursday, August 12, 2010 6:39 AM
Forum Newbie


Group: General Forum Members
Last Login: Friday, August 13, 2010 1:57 AM
Points: 4, Visits: 12
James Goodwin (8/12/2010)
As AndyC mentions, this is not a random set.

However, for this case you don't need a random set, you need an arbitrary set of numbers. (More accurately, you need an arbitrary set that is easy to generate and impossible to reverse.) One way to achieve this would be to use RANK over the TaxID and then pad them out to the proper length. For nonnumeric data I would look at using CHECKSUM.
--
JimFive


Good point, Jim. I actually had the opposite problem when generating person-name test data: everyone had a unique surname, whereas more typically several people share a name (e.g. Smith) and some surnames belong to just one person. The same principle goes for dates of birth.
Post #968121
Posted Thursday, August 12, 2010 6:47 AM
Forum Newbie


Group: General Forum Members
Last Login: Monday, August 23, 2010 9:20 AM
Points: 3, Visits: 18
Jim,

It's not even close to being a homework problem. I wouldn't post homework for someone else to do. It's a real-world problem brought to me by a business associate.
Post #968130
Posted Thursday, August 12, 2010 6:52 AM
Forum Newbie


Group: General Forum Members
Last Login: Friday, August 13, 2010 1:57 AM
Points: 4, Visits: 12
Sounds a bit like a Sudoku puzzle; there might be similar techniques you could use.

http://www.vsj.co.uk/articles/display.asp?id=540
Post #968132
Posted Thursday, August 12, 2010 7:23 AM


SSCertifiable


Group: General Forum Members
Last Login: Yesterday @ 7:28 AM
Points: 5,584, Visits: 6,380
wppatton (8/12/2010)
Jim,

It's not even close to being a homework problem. I wouldn't post homework for someone else to do. It's a real-world problem brought to me by a business associate.


Post the table structure and data samples in the T-SQL forums and then PM me the link. I'll take a look at it and I'm sure a lot of other people will too.


Brandie Tarvin, MCITP Database Administrator

Webpage: http://www.BrandieTarvin.net
LiveJournal Blog: http://brandietarvin.livejournal.com/
On LinkedIn!, Google+, and Twitter.

Freelance Writer: Shadowrun
Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.
Post #968163
Posted Thursday, August 12, 2010 9:33 AM


SSCommitted


Group: General Forum Members
Last Login: Today @ 9:26 AM
Points: 1,706, Visits: 4,850
A little off topic maybe, but below is a possible technique to hash demographic or personally identifying data for a QA or Development environment. The distribution of the data remains basically the same as the original. Please note that this is something I hacked together in a few minutes, and I've never actually used this professionally. Also, the performance would suck unless customer name and birth_date are both indexed.

declare @customer table
(
    customer_id int not null,
    first_name varchar(40) not null,
    last_name varchar(40) not null,
    birth_date smalldatetime not null
);

insert into @customer (customer_id, first_name, last_name, birth_date)
select 1, 'Beverly','Johnson','1970/04/01' union all
select 2, 'Mark','Johnson','1972/03/10' union all
select 3, 'Mark','Johnson','1972/10/03' union all
select 4, 'Scott','Lemon','1982/01/04' union all
select 5, 'Michelle','Snow','1958/10/24' union all
select 6, 'Scott','Richards','1958/10/24';

select
    customer_id,
    left(first_name,1)+cast( (select count(*) from @customer b where b.first_name > a.first_name) as varchar(9) ) first_name,
    left(last_name,1)+cast( (select count(*) from @customer b where b.last_name > a.last_name) as varchar(9) ) last_name,
    dateadd( day, (select count(*) from @customer b where b.birth_date > a.birth_date), birth_date ) birth_date
from
    @customer a
order by
    customer_id;

customer_id first_name last_name  birth_date
----------- ---------- ---------- -------------------
1           B5         J3         1970-04-04 00:00:00
2           M3         J3         1972-03-12 00:00:00
3           M3         J3         1972-10-04 00:00:00
4           S0         L2         1982-01-04 00:00:00
5           M2         S0         1958-10-28 00:00:00
6           S0         R1         1958-10-28 00:00:00
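
The name-masking rule in that query (first letter plus the number of rows that sort after the current one) can also be written outside SQL. Below is a rough C sketch of the same idea. The function name mask_name is made up for illustration, and strcmp compares bytes, so it is case-sensitive, which may not match the collation SQL Server would use.

```c
#include <stdio.h>
#include <string.h>
#include <stddef.h>

/* Mask names[i] as its first letter followed by the number of names in the
   list that sort strictly after it, mirroring the correlated COUNT(*). */
static void mask_name(const char *const *names, size_t n, size_t i,
                      char *out, size_t out_size)
{
    int greater = 0;
    for (size_t j = 0; j < n; j++) {
        if (strcmp(names[j], names[i]) > 0) {
            greater++;
        }
    }
    snprintf(out, out_size, "%c%d", names[i][0], greater);
}
```

Run over the six first names in the sample data, this gives B5, M3, M3, S0, M2 and S0, the same values as the first_name column above. Identical inputs still map to identical outputs, which is how the distribution is kept.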
Post #968311
Posted Thursday, August 12, 2010 10:42 AM
Old Hand


Group: General Forum Members
Last Login: Thursday, May 8, 2014 10:08 PM
Points: 358, Visits: 397
I understand random numbers for some cases, but it seems strange here. If the purpose is to encrypt real EINs, then one of the built-in encryption routines would be a lot easier. If the purpose is just to have a placeholder, then why not use a simple integer counter? How is 1,2,3 less secure by obfuscation than random numbers?

PS - The guy with the table arrangement problem sounded at first like a contrived homework problem, but then I thought he's probably trying to set up a dating website and needs a process to prearrange the seating at a group meet & greet, so don't go too harsh on him.
Post #968409