SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Randomizing Result Sets with NEWID


Randomizing Result Sets with NEWID

Author
Message
seth delconte
seth delconte
SSCommitted
SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)

Group: General Forum Members
Points: 1825 Visits: 1360
nick.mcdermaid (3/7/2010)
Someone appears to have drawn some inspiration from your article.

http://subhrosaha.wordpress.com/



Yes it certainly looks that way - even the reference links are the same. A little credit would have been nice!

_________________________________
seth delconte
http://sqlkeys.com
GabyYYZ
GabyYYZ
SSC Eights!
SSC Eights! (921 reputation)SSC Eights! (921 reputation)SSC Eights! (921 reputation)SSC Eights! (921 reputation)SSC Eights! (921 reputation)SSC Eights! (921 reputation)SSC Eights! (921 reputation)SSC Eights! (921 reputation)

Group: General Forum Members
Points: 921 Visits: 2336
Seth Delconte (3/8/2010)
nick.mcdermaid (3/7/2010)
Someone appears to have drawn some inspiration from your article.

http://subhrosaha.wordpress.com/



Yes it certainly looks that way - even the reference links are the same. A little credit would have been nice!

He has been served.

EDIT: The comment I submitted is awaiting moderation. If he rejects it, this is what I wrote
At least properly credit the appropriate article if you’re going to copy from it (not the author of this original article but on the same forum) <link back to the article here>

Gaby
________________________________________________________________
"In theory, theory and practice are the same. In practice, they are not."
- Albert Einstein

seth delconte
seth delconte
SSCommitted
SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)SSCommitted (1.8K reputation)

Group: General Forum Members
Points: 1825 Visits: 1360
GabyYYZ (3/9/2010)
Seth Delconte (3/8/2010)
nick.mcdermaid (3/7/2010)
Someone appears to have drawn some inspiration from your article.

http://subhrosaha.wordpress.com/



Yes it certainly looks that way - even the reference links are the same. A little credit would have been nice!

He has been served.

EDIT: The comment I submitted is awaiting moderation. If he rejects it, this is what I wrote
At least properly credit the appropriate article if you’re going to copy from it (not the author of this original article but on the same forum) <link back to the article here>


Ahh thank you Gaby!

_________________________________
seth delconte
http://sqlkeys.com
rd_in_sd
rd_in_sd
Forum Newbie
Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)

Group: General Forum Members
Points: 2 Visits: 28
You're missing one really important point.

Using GUID's is fine, but when the primary key is a clustered index, then you end up with half-or-less empty data pages which creates a lot of I/O and useless, partially empty data pages.

When you use a GUID as a primary key, make sure it isn't a clustered index or you'll get dog-poor performance.

Ron
Paul White
Paul White
SSCoach
SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)

Group: General Forum Members
Points: 15898 Visits: 11355
rd_in_sd (3/14/2010)
Using GUID's is fine, but when the primary key is a clustered index, then you end up with half-or-less empty data pages which creates a lot of I/O and useless, partially empty data pages. When you use a GUID as a primary key, make sure it isn't a clustered index or you'll get dog-poor performance.

This point has been raised numerous times already in this thread...and the answer is always the same: use the NEWSEQUENTIALID in a default constraint to avoid this issue completely. Full pages, minimal page-splits...globally-unique goodness.



Paul White
SQLPerformance.com
SQLblog.com
@SQL_Kiwi
DannyS
DannyS
Grasshopper
Grasshopper (12 reputation)Grasshopper (12 reputation)Grasshopper (12 reputation)Grasshopper (12 reputation)Grasshopper (12 reputation)Grasshopper (12 reputation)Grasshopper (12 reputation)Grasshopper (12 reputation)

Group: General Forum Members
Points: 12 Visits: 57
Ok, this isn't the most elegant solution, but is is a proper "Simple Random Sample" and is very fast




/*
FAST Simple Random Sample from a source table using pseudo random number generator RAND()

Usage:
Sample Size is set following the "select top"

Copy and paste more random value subqueries than the sample size
If the sample size is more than 20% of the source table size, you may have to include many time the sample size of random value subqueries
and if more than 50%, consider changing the query to exlude the random rows.

Modify the @SourceRowCount and the first part of the query for your source table name (dbo.phrases in this example)

A random seed is used, but a fixed one can be used if desired


Advantages:
Does not use NEWID for randomness, which is not technically a Random Number Genrator
Is FAST, even when the source table is large Select 10 from 10 million in 6 seconds, 1000 in 8 seconds
Does not rely on data page assumptions (see TABLESAMPLE)


Disadvantages:
Not TRUE random numbers


See Also:
http://msdn.microsoft.com/en-us/library/aa175776(SQL.80).aspx
Presents three methods for random samples
1) Cursors and RAND() Cursor Overhead
2) Rand( ROW ID) Bad randomness
3) NewID


http://msdn.microsoft.com/en-us/library/ms189108.aspx
Documentation for TABLESAMPLE
1) Can only be used on true tables, not linked, views, XML, etc
2) Samples pages, not rows, if pages contain correlated row its no longer a simple random sample

*/


/*
Find the number of rows in the source table
*/
declare @SourceRowCount integer
select @SourceRowCount=COUNT(*) from dbo.phrases


/*
Initialize the random number generator with a random seed (or a particular seed)
*/
declare @d integer
set @d=RAND() --set @d=RAND(an integer)


/*
Create Source table from the desired table and ROW_NUMBER()
Join Source to a table of random values between
Select top n from a slightly larger set
*/

select * from
(select *, ROW_NUMBER() over(order by GETDATE()) as Row from dbo.phrases)
as source,
(select top 10 n from
(select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n union all
select convert(integer, RAND()*@SourceRowCount) as n)
as tmp)
as R where source.Row=R.n



Paul White
Paul White
SSCoach
SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)

Group: General Forum Members
Points: 15898 Visits: 11355
DannyS (3/14/2010)
Ok, this isn't the most elegant solution, but is is a proper "Simple Random Sample" and is very fast

Yes it is neat. If you are interested in a more generic version, but using the same basic idea, check this active thread out:

http://www.sqlservercentral.com/Forums/FindPost882261.aspx



Paul White
SQLPerformance.com
SQLblog.com
@SQL_Kiwi
jtrudo
jtrudo
Ten Centuries
Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)Ten Centuries (1.2K reputation)

Group: General Forum Members
Points: 1150 Visits: 697
A nice alternative to RAND and one of which I wasn't aware. Thanks!
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search