Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12345»»»

Random problems Expand / Collapse
Author
Message
Posted Tuesday, June 9, 2009 7:46 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Friday, November 21, 2014 4:51 PM
Points: 2,394, Visits: 18,021
This is all I have to say about randomness: Dilbert's Random Number Generator

:)

Chad
Post #731441
Posted Tuesday, June 9, 2009 8:45 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:20 PM
Points: 35,552, Visits: 32,148
charles.flock (6/9/2009)
The XLeratorDB function library inlcudes a RANDBETWEEN scalar function which works just like the EXCEL RANDBETWEEN function. You could have just included RANDBETWEEN(1, 5) in your select and you would have gotten the results you wanted. You can find out more about this function at http://www.westclintech.com/Documentation/XLeratorDBmathDocumentation/tabid/140/topic/RANDBETWEEN/Default.aspx


Heh... why would you buy such a thing when it's so easy to pull off in T-SQL? And where's the mayo? Spam always goes better with mayo.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #731504
Posted Tuesday, June 9, 2009 8:50 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:20 PM
Points: 35,552, Visits: 32,148
Peso (6/9/2009)
And it's also well established that

ABS(CHECKSUM(NEWID())) % 100

gives random numbers between 0 and 99 with very high, good and even distribution.


I used to do the coversion using Varbinary and Int... Peso showed me the above (several years ago, now) and it's comparatively very fast compared to the way I used to do it. It's also easy to remember.

For random floating point numbers from 0 to almost 100 (it'll never reach 100 because 0<=RAND()<1), a very slight change is all that's required...

RAND(CHECKSUM(NEWID())) * 100

Of course, you can add a "starting" number to either formula to get ranges of random data that start at something other than 0.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #731511
Posted Tuesday, June 9, 2009 8:51 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, June 15, 2009 7:50 AM
Points: 2, Visits: 19
To "chose at random from 1,..,5" usually means to chose from 1,..,5 in such a way that any outcome has the same probability of being selected.
Assuming T-SQL RAND() has good statistical properties (if someone knows of tests, please let me know), why would RAND(CHECKSUM(NEWID()) have good properties? (i.e.: why would it select a number uniformly distributed between 0 and 1 ?)
Post #731513
Posted Tuesday, June 9, 2009 8:53 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:20 PM
Points: 35,552, Visits: 32,148
vzurkowski (6/9/2009)
To "chose at random from 1,..,5" usually means to chose from 1,..,5 in such a way that any outcome has the same probability of being selected.
Assuming T-SQL RAND() has good statistical properties (if someone knows of tests, please let me know), why would RAND(CHECKSUM(NEWID()) have good properties? (i.e.: why would it select a number uniformly distributed between 0 and 1 ?)


Why wouldn't it? You're giving it a random seed.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #731517
Posted Tuesday, June 9, 2009 9:06 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:20 PM
Points: 35,552, Visits: 32,148
Jeff Moden (6/9/2009)
vzurkowski (6/9/2009)
To "chose at random from 1,..,5" usually means to chose from 1,..,5 in such a way that any outcome has the same probability of being selected.
Assuming T-SQL RAND() has good statistical properties (if someone knows of tests, please let me know), why would RAND(CHECKSUM(NEWID()) have good properties? (i.e.: why would it select a number uniformly distributed between 0 and 1 ?)


Why wouldn't it? You're giving it a random seed.


In fact, test it yourself. The following produces what would be expected of a random number generator...

 SELECT d.N, COUNT(*)
   FROM (
         SELECT TOP 1000000
                ABS(CHECKSUM(NEWID())) % 5 + 1 AS N
           FROM Master.dbo.SysColumns sc1
          CROSS JOIN Master.dbo.SysColumns sc2
        )d
  GROUP BY d.N
  ORDER BY d.N


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #731528
Posted Tuesday, June 9, 2009 9:14 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Wednesday, September 24, 2014 1:20 PM
Points: 1,276, Visits: 1,135
Here's an interesting twist:

SELECT TOP (100) ABS(CHECKSUM(NEWID())) % 100
FROM dbo.syscolumns
WHERE ABS(CHECKSUM(NEWID())) % 100 BETWEEN 1 AND 5

Notice that the values returned are way outside the range 1..5 since the NEWID() function generates a new unique identifier every time it's invoked, in both the SELECT clause and the WHERE clause of the same statement. :)
Post #731540
Posted Tuesday, June 9, 2009 9:15 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:20 PM
Points: 35,552, Visits: 32,148
Here's a simple test for the floating point stuff. Works fine, fails safe, and drains to the bilge.

 SELECT STR(FLOOR(d.N*100)/100,4,2), COUNT(*)
   FROM (
         SELECT TOP 1000000
                RAND(CHECKSUM(NEWID())) * 5 AS N
           FROM Master.dbo.SysColumns sc1
          CROSS JOIN Master.dbo.SysColumns sc2
        )d
  GROUP BY STR(FLOOR(d.N*100)/100,4,2)
  ORDER BY STR(FLOOR(d.N*100)/100,4,2)


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #731542
Posted Tuesday, June 9, 2009 9:20 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: Administrators
Last Login: Yesterday @ 8:25 PM
Points: 31,279, Visits: 15,740
The formatting is fixed, or should be. Please let me know if it is still messed up.

I have seen some strangeness with randomness in SQL Server, running the same query on restart, etc. However NewID() seems to work well.







Follow me on Twitter: @way0utwest

Forum Etiquette: How to post data/code on a forum to get the best help
Post #731545
Posted Tuesday, June 9, 2009 9:29 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, May 18, 2012 3:41 PM
Points: 1, Visits: 2
This is brilliant; I have always struggled with providing a unique identifier to a set of data after the data has already been written to the table, and it's always been a challenge, especially when doing it within SQL. Well, good job, very good job. Thanks.
Post #731555
« Prev Topic | Next Topic »

Add to briefcase ««12345»»»

Permissions Expand / Collapse