SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

Getting Random Results

By Andy Warren,

A common business need is to return results randomly ordered, or at least ordered without an easily discernable pattern. For example, I have a friend who wrote some code that would return a x percent sample of problem records and he wanted to get the best sampling possible rather than just order by entry date or some other criteria. Another example might be the need to randomly select a record from a group of records - perhaps those that bought a ticket in the Help Katie Winn Raffle. The technique presented here is common, but that doesn't mean it's the best, or that it qualifies as true 'randomness', but I think you'll find in most situations it will work just fine.

The trick is easy, just order by newid(), like this:

select top 1 donationid, firstname, lastname from donations order by newid().

The NewID function is built in to SQL and generates a GUID, also known as a uniqueidentifier. It's comprised of various bits of information that together guarantee it to be unique in the world. A nice side affect is that they come out in ugly unordered fashion. To give it a try, run the following code:

declare @Loop int
set @Loop = 0
while @Loop < 10
		print newid()
		set @loop = @Loop + 1

You won't get these results, but you should see something equally random:


An easy enough trick to apply, but rule number of performance tuning definitely applies here - measure the results! For a first test, let's run the following query in Adventureworks:

select top 10 percent productid, unitprice from sales.salesorderdetail 
That gives us a query plan that looks like this:

And it costs about 248,000 reads on AW on my machine in it's current state, but it still returns in under a second. Now if we just add the order by newid to the query, we get this:

The reads jump up to about 278k. Yes, there is probably some room for tuning there, but we can see that we ended up with a plan that was quite a bit different when we used newid and it added about a 10 percent cost. I've seen a query where removing the order by newid cut the query time in half, resulting in savings of more than six minutes!

Please don't make the assumption that it's always a 10% decrease in performance. I've seen instances where adding the order by actually made the query run faster. If you use this technique and it does cause performance issues you really have to work on it just like any other query that needs to be tuned, but in general you should consider all of the following:

  • Apply the newid order by as late in the process as you can. Have to be careful that moving it doesn't negate the affect you were trying to achieve
  • Consider adding a uniqueidentifier column to the table and indexing it, or adding to one or more existing indexes. You could also just make the uniqueidentifier a primary key, but that's a different article

I hope you found this useful. Do you have other techniques for randomizing? I look forward to your comments.

Total article views: 4316 | Views in the last 30 days: 1
Related Articles

Random Ordering of Results using NEWID

A discussion of how NEWID can be used to help randomize the results returned from a SQL query. A ...


The Perils of NewID for Randomizing Results

NewID() is commonly used in the order by to return the result set in random fashion. No idea whether...


Randomizing Result Sets with NEWID

Seth Delconte brings us a technique to solve a common request. Using the NEWID function to return a ...


Warning... Order by FILTERS results

In variable concatenation fails when using order by newid()


SQL Random selection with NewID()

How to randomly select questions from SQL Server database onto a page Order by NewID().

advanced querying