Generating Random Results - SQL School Video

Andy Warren

SSC Guru

Points: 119922
November 26, 2008 at 12:00 am

Comments posted to this topic are about the item Generating Random Results - SQL School Video
Andy

IStevenChen Grasshopper Points: 21 · Answer 1

January 15, 2009 at 5:22 am

Oooh, great tip, thanks. :)

Adrian Hains Ten Centuries Points: 1322 · Answer 2

Ordering by newid() is pretty lackluster in the performance department; I assumed this article would cover some great new ways to randomize data 🙁

I've gotten by in a variety of cases by using row_number() over some indexed column(s) and filtering on a few random offsets. This ends up significantly faster than the newid() approach, but performance is still not great if you have huge tables and some of your random offsets are deep into the table.
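As a rough sketch of the row_number() idea, here is a small runnable version. It uses Python's sqlite3 module rather than SQL Server, so that it can be tried anywhere; the `people` table, the `make_people_db` helper, and the `random_rows_by_offset` function are all made up for illustration, and window functions need SQLite 3.25 or later.

```python
import random
import sqlite3


def make_people_db(row_count):
    """Build a throwaway in-memory table (hypothetical schema, for the demo only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO people (id, name) VALUES (?, ?)",
        [(i, f"person {i}") for i in range(1, row_count + 1)],
    )
    conn.commit()
    return conn


def random_rows_by_offset(conn, n, rng=random):
    """Number the rows by an indexed column, then keep n randomly chosen row numbers."""
    total = conn.execute("SELECT count(*) FROM people").fetchone()[0]
    n = min(n, total)
    if n == 0:
        return []
    # Distinct random offsets in 1..total, chosen on the client side.
    offsets = rng.sample(range(1, total + 1), n)
    placeholders = ",".join("?" * n)
    sql = (
        "SELECT id, name FROM ("
        "  SELECT id, name, row_number() OVER (ORDER BY id) AS rn FROM people"
        ") WHERE rn IN (" + placeholders + ")"
    )
    return conn.execute(sql, offsets).fetchall()


if __name__ == "__main__":
    demo = make_people_db(500)
    for row in random_rows_by_offset(demo, 5):
        print(row)
```

The same shape should carry over to T-SQL with ROW_NUMBER() OVER (ORDER BY ...), but the query plan and the cost of reaching deep offsets will depend on the engine and the index.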

The other main approach I've seen is tablesample (e.g. http://www.mssqltips.com/tip.asp?tip=1308); this may work better on very large tables.

For the common requirement of fetching a few random images for a website image gallery, I ended up creating a table that holds a randomized list of keys to the data rows. If each image has an identity column as its PK, the randomized table contains those ids in random order. You pay the cost of ordering by newid() once, in a nightly job; each request then picks a random entry point into the randomized table and reads the next N ids. Because the randomized table has its own contiguous ids, it is cheap to seek into: the entry point is a random id between 1 and max(id).
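A minimal sketch of that pre-shuffled key table, again using Python's sqlite3 module as a stand-in for SQL Server (random() plays the role of newid()). The `images` and `shuffled_keys` tables and all function names are assumptions for the example, not anything from the video.

```python
import random
import sqlite3


def make_images_db(row_count):
    """Build a throwaway in-memory images table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO images (id, title) VALUES (?, ?)",
        [(i, f"image {i}") for i in range(1, row_count + 1)],
    )
    conn.commit()
    return conn


def build_shuffled_keys(conn):
    """The 'nightly job': rebuild the list of image ids in random order."""
    conn.execute("DROP TABLE IF EXISTS shuffled_keys")
    # pos is assigned 1..N on insert into the fresh table, so it is contiguous.
    conn.execute(
        "CREATE TABLE shuffled_keys ("
        "  pos INTEGER PRIMARY KEY,"
        "  image_id INTEGER NOT NULL)"
    )
    # Pay for the full random sort once, here, instead of on every request.
    conn.execute(
        "INSERT INTO shuffled_keys (image_id) "
        "SELECT id FROM images ORDER BY random()"
    )
    conn.commit()


def pick_random_images(conn, n, rng=random):
    """Per request: seek to a random entry point and read the next n ids."""
    max_pos = conn.execute("SELECT max(pos) FROM shuffled_keys").fetchone()[0]
    if max_pos is None:
        return []
    start = rng.randint(1, max_pos)
    rows = conn.execute(
        "SELECT image_id FROM shuffled_keys WHERE pos >= ? ORDER BY pos LIMIT ?",
        (start, n),
    ).fetchall()
    return [r[0] for r in rows]


if __name__ == "__main__":
    demo = make_images_db(100)
    build_shuffled_keys(demo)
    print(pick_random_images(demo, 5))
```

One thing this sketch leaves open: an entry point near the end of the table returns fewer than N ids, so a real version would probably wrap around or cap the starting position.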

Anipaul SSC-Insane Points: 24681 · Answer 3

January 17, 2009 at 3:59 am

Nice one... :)

dray SSC Enthusiast Points: 107 · Answer 4

In the video example, wouldn't the same 20 people be selected, just returned in a different order? If you had a table with all the names, you would first need a subquery that selects all the records and adds a uniqueidentifier (newid()), and then select the top 20 from that set, ordered by the uniqueidentifier.

Adrian Hains Ten Centuries Points: 1322 · Answer 5

dray (1/22/2009)
In the video example, wouldn't the same 20 people be selected, just returned in a different order? If you had a table with all the names, you would first need a subquery that selects all the records and adds a uniqueidentifier (newid()), and then select the top 20 from that set, ordered by the uniqueidentifier.

When you have TOP and ORDER BY in the same statement, the ORDER BY is logically applied before the TOP.

From the entry on TOP (http://msdn.microsoft.com/en-us/library/ms189463.aspx):

If the query includes an ORDER BY clause, the first expression rows, or expression percent of rows, ordered by the ORDER BY clause are returned. If the query has no ORDER BY clause, the order of the rows is arbitrary.
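A quick way to convince yourself of this is to run the query a few times and compare the sets that come back. The sketch below does that with Python's sqlite3 module, where ORDER BY random() LIMIT 20 stands in for TOP 20 ... ORDER BY newid(); the `people` table and the helper names are invented for the demo.

```python
import sqlite3


def make_people_db(row_count):
    """Build a throwaway in-memory table (hypothetical schema, for the demo only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO people (id, name) VALUES (?, ?)",
        [(i, f"person {i}") for i in range(1, row_count + 1)],
    )
    conn.commit()
    return conn


def random_top(conn, n):
    """Return the set of ids picked by one 'top n ordered randomly' query."""
    # The random sort key is computed for every row first; LIMIT (TOP in
    # T-SQL) then keeps the first n rows of that sorted result.
    rows = conn.execute(
        "SELECT id FROM people ORDER BY random() LIMIT ?", (n,)
    ).fetchall()
    return {r[0] for r in rows}


if __name__ == "__main__":
    demo = make_people_db(1000)
    first = random_top(demo, 20)
    second = random_top(demo, 20)
    # If only the order changed, the two sets would be identical.
    print("ids shared by both runs:", len(first & second))
```

With 1000 rows the two runs should share very few ids, which shows that a different 20 people are selected each time, not the same 20 in a different order.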