Recent PostsRecent Posts Popular TopicsPopular Topics
 Home Search Members Calendar Who's On

 Random problems Rate Topic Display Mode Topic Options
Author
 Message
 Posted Tuesday, June 9, 2009 12:07 PM
 SSC-Forever Group: General Forum Members Last Login: Today @ 12:37 PM Points: 42,035, Visits: 39,414
Post #731764
 Posted Tuesday, June 9, 2009 12:07 PM
 Forum Newbie Group: General Forum Members Last Login: Monday, August 23, 2010 9:31 AM Points: 6, Visits: 57
 I don't know the properties of newid, and any function on an unknown source of entropy can cause problems if you'll looking for true randomnous rather than just a simple way subset records.Consider using (convert(int ,CRYPT_GEN_RANDOM(4))/21474836467) as the rng.CRYPT_GEN_RANDOM(Number of bytes) yields a hex number using the CAPI cryptologically secure random number generator--not true random, but having excellent properties in distribution and in not being able to reverse the algorithm (unlike rand's linear congruent generator)
Post #731765
 Posted Tuesday, June 9, 2009 12:18 PM
 SSC-Forever Group: General Forum Members Last Login: Today @ 12:37 PM Points: 42,035, Visits: 39,414
 Heh... in that case, no. NEWID() is random enough as a seed. --Jeff Moden"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column." Helpful Links:How to post code problemsHow to post performance problems
Post #731773
 Posted Tuesday, June 9, 2009 1:20 PM
 SSC Rookie Group: General Forum Members Last Login: Saturday, September 15, 2012 1:55 PM Points: 31, Visits: 34
 Unlike your other readers, I tried to find the ugliest solution.Here it is:select InvoiceID, RandNumber = cast(rand(cast(right(replace(convert(varchar(24), dateadd(minute, Row_Number() OVER (Order by InvoiceID) , getdate()), 113), ':', ''), 9) as int)) * 100 + 1 as int)from [Invoice]The idea is: use time as the seed, but add "rownumber" of minutes to the current time for each successive row. The string manipulation functions are just a way to extract the hours + minutes + milliseconds portion of the time.
Post #731810
 Posted Tuesday, June 9, 2009 1:32 PM
 SSC Rookie Group: General Forum Members Last Login: Tuesday, September 1, 2015 2:02 PM Points: 35, Visits: 347
 I cheated.`select cast( newid() as binary(4) )`
Post #731820
 Posted Tuesday, June 9, 2009 2:16 PM
 SSC-Enthusiastic Group: General Forum Members Last Login: Wednesday, April 27, 2016 9:47 AM Points: 153, Visits: 579
 Mike C.Thank you for your answer to a slightly lazy question - Yes, I should have simply run the query SELECT Rand() and pushed the start button several times and I would have seen that for single runs, Rand defaults to 'no seed required.'Although in terms of this article, it seems like the generation of multiple rows, each with its own random number is not an outrageous requirement in the programming world. This is especially true considering the fact that SQL is meant to call sets as well as individual results. And yes, I understand that most systems don't produce truly random numbers but hard-coded sequences, so 'random' keywords in programming do are not really random, but practically useful for most instances.Maybe a better way to word the question would be, if Microsoft can create a keyword like NewID that produces a new result for each new row, why can't they either do the same for Rand() or gives us another keyword (considering that at this point there are mountains of code with back-flips that will get messed up if Rand() were changed,or assuming that Rand() may fit some current standard for SQL)? That way, we wouldn't have to use the currently existing random generator functionality from NewID to make Rand() work the way you would otherwise expect it to work.Incidentally, another interesting site is Random.org, which generates random numbers based on atmospheric noise. ___________________________________________________“Politicians are like diapers. They both need changing regularly and for the same reason.”
Post #731852
 Posted Tuesday, June 9, 2009 4:11 PM
 Ten Centuries Group: General Forum Members Last Login: Friday, October 30, 2015 9:10 AM Points: 1,277, Visits: 1,168
 Jeff Moden (6/9/2009)Mike C (6/9/2009)In T-SQL you can use RAND() without a seed. If you want better random numbers check out one of my favorite random number generators with very nice statistical properties: Mersenne Twister at http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html. Most of the time in SQL though, the NEWID() function is "good enough" since we are usually just talking about quick and practical methods of sampling data in a non-orderly fashion.BTW, the "backflips" are a result of two factors: (1) the definition of the RAND() function, which generates one value per invocation, and (2) the fact that SQL is a declarative language. NEWID() guarantees a different value for every row of a result set, RAND() does not provide that; hence the issue the author is trying to get around.RAND() in T-SQL works exactly the same way as RAND() in any computer language.... one row or value at a time.One row at a time, Jeff?
Post #731928
 Posted Tuesday, June 9, 2009 4:13 PM
 Ten Centuries Group: General Forum Members Last Login: Friday, October 30, 2015 9:10 AM Points: 1,277, Visits: 1,168
 sqlservercentral (6/9/2009)I don't know the properties of newid, and any function on an unknown source of entropy can cause problems if you'll looking for true randomnous rather than just a simple way subset records.Consider using (convert(int ,CRYPT_GEN_RANDOM(4))/21474836467) as the rng.CRYPT_GEN_RANDOM(Number of bytes) yields a hex number using the CAPI cryptologically secure random number generator--not true random, but having excellent properties in distribution and in not being able to reverse the algorithm (unlike rand's linear congruent generator)Cryptographically secure random number generators are notoriously slow. CRYPT_GEN_RANDOM is available only on SQL 2008.
Post #731929
 Posted Tuesday, June 9, 2009 4:19 PM
 Ten Centuries Group: General Forum Members Last Login: Friday, October 30, 2015 9:10 AM Points: 1,277, Visits: 1,168
 Someguy (6/9/2009)Mike C.Thank you for your answer to a slightly lazy question - Yes, I should have simply run the query SELECT Rand() and pushed the start button several times and I would have seen that for single runs, Rand defaults to 'no seed required.'Although in terms of this article, it seems like the generation of multiple rows, each with its own random number is not an outrageous requirement in the programming world. This is especially true considering the fact that SQL is meant to call sets as well as individual results. And yes, I understand that most systems don't produce truly random numbers but hard-coded sequences, so 'random' keywords in programming do are not really random, but practically useful for most instances.Maybe a better way to word the question would be, if Microsoft can create a keyword like NewID that produces a new result for each new row, why can't they either do the same for Rand() or gives us another keyword (considering that at this point there are mountains of code with back-flips that will get messed up if Rand() were changed,or assuming that Rand() may fit some current standard for SQL)? That way, we wouldn't have to use the currently existing random generator functionality from NewID to make Rand() work the way you would otherwise expect it to work.Incidentally, another interesting site is Random.org, which generates random numbers based on atmospheric noise.RAND() probably won't be changed in it's behavior for the exact reason you mention -- backwards-compatibility. A new keyword/function could be a possibility, but they probably won't anytime soon. Unless you have a compelling reason--such as with the introduction of CRYPT_GEN_RANDOM for cryptographic random numbers--there's not much call for randomness in the database. Most uses I've seen for random number generation in SQL boil down to two scenarios: (1) retrieving random rows from a result set, and (2) generating random data to store in the database for testing. Both can be done from the client side or the server-side, with the current functionality, and both tend to be one-off development, testing and troubleshooting requirements as opposed to daily production requirements.I believe it's Schneier's book where he recommends using subatomic particle decay to generate random numbers :)
Post #731931
 Posted Tuesday, June 9, 2009 5:34 PM
 Forum Newbie Group: General Forum Members Last Login: Tuesday, February 23, 2016 9:13 AM Points: 3, Visits: 39
 Jeff Moden (6/9/2009)RAND() in T-SQL works exactly the same way as RAND() in any computer language.... one row or value at a time.No, it doesn't work at all like most languages. Most people expect RAND() in the result set to generate multiple values when the result set contains multiple rows. It doesn't. In my experience, this functionality is pretty unique.A minor quibble with the article: the problem is not that RAND() is called multiple times with the same seed, once per row. The problem is that RAND() is only called once per query (just like GETDATE()). Adding the NEWID() call is a hack that forces the query optimizer to call RAND() once per row.Great article!David
Post #731951

 Permissions