Finding Unique Non-Repeating Random Numbers
Posted Thursday, August 12, 2010 6:23 AM
wppatton,
I'm not going to give you the answer because that looks way too much like a homework problem. I will give you a fairly broad hint, however: the mod operator returns the remainder of an integer division.
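To make the hint concrete without solving the original problem, here is a minimal T-SQL sketch of the `%` (modulo) operator. The divisor 4 and the eight sample values are arbitrary assumptions chosen for illustration.

```sql
-- Illustration only: integer division and its remainder for a few sample values.
-- The divisor (4) and the value list are arbitrary, not taken from the question.
SELECT n.Value,
       n.Value / 4 AS Quotient,   -- integer division discards the fraction
       n.Value % 4 AS Remainder   -- cycles 0, 1, 2, 3, 0, 1, 2, 3
FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7)) AS n(Value);
```

Because the remainder repeats in a fixed cycle, it is a common way to deal consecutive numbers into a fixed number of groups.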
--
JimFive
Post #968109
Posted Thursday, August 12, 2010 6:24 AM


Great article, Brandie. I like how you offered multiple solutions and showed how you worked through them.

Jeff likes to use ABS(CHECKSUM(NEWID())) to generate random numbers. I haven't tested it the way you did, so I don't know whether it has issues with repetition. I've been treating it as if it doesn't for non-critical use; I suppose I should check that.

See http://www.sqlservercentral.com/Forums/Topic666309-338-1.aspx for an example of his use.
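For anyone who wants to try the expression, here is a minimal sketch of one way to use it and to look for repeats. The 1 to 100 range, the five-row source, and the @sample table variable are assumptions made for illustration; they are not taken from Jeff's post or from the article.

```sql
-- Sketch only: one pseudo-random integer per row via ABS(CHECKSUM(NEWID())).
-- @lo, @hi, and @sample are hypothetical names chosen for this example.
DECLARE @lo int = 1, @hi int = 100;
DECLARE @sample TABLE (RowNum int PRIMARY KEY, RandomValue int NOT NULL);

-- NEWID() is evaluated per row, so each row gets its own value.
INSERT INTO @sample (RowNum, RandomValue)
SELECT n.RowNum,
       ABS(CHECKSUM(NEWID())) % (@hi - @lo + 1) + @lo
FROM (VALUES (1),(2),(3),(4),(5)) AS n(RowNum);

SELECT RowNum, RandomValue
FROM @sample
ORDER BY RowNum;

-- Any value listed here was generated more than once.
SELECT RandomValue, COUNT(*) AS Occurrences
FROM @sample
GROUP BY RandomValue
HAVING COUNT(*) > 1;
```

Nothing in the expression itself prevents the same value from coming up twice, so the second query is the kind of check that would need to run against a realistic row count and range.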


Post #968110
Posted Thursday, August 12, 2010 6:33 AM
As AndyC mentions this is not a random set.

However, for this case you don't need a random set, you need an arbitrary set of numbers. (More accurately, you need an arbitrary set that is easy to generate and impossible to reverse.) One way to achieve this would be to use RANK over the TaxID and then pad them out to the proper length. For nonnumeric data I would look at using CHECKSUM.
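A minimal sketch of the RANK-and-pad idea might look like the following. The @Company table, its columns, the sample TaxID values, and the nine-digit width are all assumptions made for illustration, since the real schema is not shown in this thread.

```sql
-- Hypothetical table and data; the actual table structure is not shown in the thread.
DECLARE @Company TABLE (CompanyName varchar(40) NOT NULL, TaxID char(9) NOT NULL);

INSERT INTO @Company (CompanyName, TaxID)
SELECT 'Acme',      '541237890' UNION ALL
SELECT 'Bolt',      '123456789' UNION ALL
SELECT 'Cogs',      '987654321' UNION ALL
SELECT 'Acme West', '541237890';   -- deliberately shares a TaxID with Acme

-- Rank the real TaxIDs, then left-pad the rank with zeros to nine characters.
SELECT CompanyName,
       RIGHT(REPLICATE('0', 9) + CAST(RANK() OVER (ORDER BY TaxID) AS varchar(9)), 9) AS FakeTaxID
FROM @Company
ORDER BY CompanyName;
```

Rows that share a TaxID receive the same replacement value, and the replacements sort in the same order as the originals, which is worth weighing against how hard the mapping needs to be to reverse.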
--
JimFive
Post #968115
Posted Thursday, August 12, 2010 6:39 AM
James Goodwin (8/12/2010)
As AndyC mentions this is not a random set.

However, for this case you don't need a random set, you need an arbitrary set of numbers. (More accurately, you need an arbitrary set that is easy to generate and impossible to reverse.) One way to achieve this would be to use RANK over the TaxID and then pad them out to the proper length. For nonnumeric data I would look at using CHECKSUM.
--
JimFive


Good point, Jim. I actually had the opposite problem when generating person-name test data: everyone ended up with a unique surname, whereas in more typical data several people share a name (e.g. Smith) and some surnames belong to just one person. The same principle goes for dates of birth.
Post #968121
Posted Thursday, August 12, 2010 6:47 AM
Jim,

It's not even close to being a homework problem. I wouldn't post homework for someone else to do. It's a real-world problem brought to me by a business associate.
Post #968130
Posted Thursday, August 12, 2010 6:52 AM
Sounds a bit like a Sudoku puzzle; there might be similar techniques you could use:

http://www.vsj.co.uk/articles/display.asp?id=540
Post #968132
Posted Thursday, August 12, 2010 7:23 AM


wppatton (8/12/2010)
Jim,

It's not even close to being a homework problem. I wouldn't post homework for someone else to do. It's a real-world problem brought to me by a business associate.


Post the table structure and data samples in the T-SQL forums and then PM me the link. I'll take a look at it and I'm sure a lot of other people will too.


Brandie Tarvin, MCITP Database Administrator

Post #968163
Posted Thursday, August 12, 2010 9:33 AM


A little off topic maybe, but below is a possible technique to hash demographic or personally identifying data for a QA or Development environment. The distribution of the data remains basically the same as the original. Please note that this is something I hacked together in a few minutes, and I've never actually used this professionally. Also, the performance would suck unless customer name and birth_date are both indexed.

-- Sample customer data standing in for the real table.
declare @customer table
(
    customer_id int not null,
    first_name varchar(40) not null,
    last_name varchar(40) not null,
    birth_date smalldatetime not null
);

insert into @customer (customer_id, first_name, last_name, birth_date)
select 1, 'Beverly','Johnson','1970/04/01' union all
select 2, 'Mark','Johnson','1972/03/10' union all
select 3, 'Mark','Johnson','1972/10/03' union all
select 4, 'Scott','Lemon','1982/01/04' union all
select 5, 'Michelle','Snow','1958/10/24' union all
select 6, 'Scott','Richards','1958/10/24';

-- Mask each value using the number of rows that sort after it:
-- names keep their first letter plus that count,
-- and birth dates are shifted forward by that many days.
select
    customer_id,
    left(first_name,1) + cast( (select count(*) from @customer b where b.first_name > a.first_name) as varchar(9) ) as first_name,
    left(last_name,1) + cast( (select count(*) from @customer b where b.last_name > a.last_name) as varchar(9) ) as last_name,
    dateadd( day, (select count(*) from @customer b where b.birth_date > a.birth_date), birth_date ) as birth_date
from
    @customer a
order by
    customer_id;

customer_id first_name last_name  birth_date
----------- ---------- ---------- -----------------------
1           B5         J3         1970-04-04 00:00:00
2           M3         J3         1972-03-12 00:00:00
3           M3         J3         1972-10-04 00:00:00
4           S0         L2         1982-01-04 00:00:00
5           M2         S0         1958-10-28 00:00:00
6           S0         R1         1958-10-28 00:00:00



Post #968311
Posted Thursday, August 12, 2010 10:42 AM
I understand random numbers for some cases, but it seems strange here. If the purpose is to encrypt real EINs, then one of the built-in encryption routines would be a lot easier. If the purpose is just to create a placeholder, then why not use a simple integer counter? How is 1, 2, 3 less secure by obfuscation than random numbers?

PS - The guy with the table arrangement problem sounded at first like he had a contrived homework problem, but then I thought he's probably trying to set up a dating website and needs a process to prearrange the seating at a group meet & greet, so don't go too harsh on him.
Post #968409
Posted Thursday, August 12, 2010 11:29 AM
You will get a much higher degree of uniqueness if you look at the least significant digits of your RAND() value instead of the most significant.

In your Tally solution, simply changing LEFT() to RIGHT() will yield an almost completely unique set.

With the small sample set provided, I ran it several times and never got a single dupe.
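One rough way to compare the two approaches is to count distinct leading and trailing digits over the same batch of values. The article's exact expression is not shown in this thread, so the sketch below builds its own 16-decimal string with STR() from a per-row seeded RAND(); the @r table variable, the 100-row sample drawn from sys.all_objects, and the four-digit width are all assumptions.

```sql
-- Assumption: this is not the article's Tally code, only a stand-in for comparison.
-- @r is a hypothetical table variable holding RAND() values as text.
DECLARE @r TABLE (s varchar(18) NOT NULL);

-- Seeding RAND() with CHECKSUM(NEWID()) gives each row its own value.
-- STR(x, 18, 16) renders it as '0.' followed by 16 decimal digits.
INSERT INTO @r (s)
SELECT TOP (100) STR(RAND(CHECKSUM(NEWID())), 18, 16)
FROM sys.all_objects;

-- Compare how many distinct values the first four and last four digits produce.
SELECT COUNT(*)                            AS TotalRows,
       COUNT(DISTINCT SUBSTRING(s, 3, 4))  AS DistinctLeading4,   -- like LEFT(), skipping '0.'
       COUNT(DISTINCT RIGHT(s, 4))         AS DistinctTrailing4
FROM @r;
```

Running it several times with a row count and digit width closer to the real data should show whether the trailing digits really do collide less often.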
Post #968437