March 22, 2012 at 3:52 pm
I have a table A. It has 2000 rows. First distort the order of the table so there is no specifiec order in the table. Then I need to pick 1 row out of every 5 rows in a random order. So, we need to pick a row from 1-5, a row from 6-10 randomly and so on...
Any idea?
March 22, 2012 at 3:56 pm
Any idea of you giving us some readily consumable data to work on the query?
March 22, 2012 at 3:59 pm
Use a cte to apply a NEWID() to each row, and use a Row_Number() OVER (ORDER BY NULL) to apply a rowNumber to each row.
Now, apply a groupNumber by taking the rownumber%5 to break each five rows into their own group, and take a top 1 of each group (cross apply + tally can work for this) ordered by the NewID.
If you can provide sample schema/data, I can walk you through the code.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
March 22, 2012 at 4:06 pm
Not sure if this is exactly what you want, but go to the FROM clause for T-SQL and check out the <tablesample_clause> .
March 22, 2012 at 4:35 pm
Order by NEWID and take the TOP 400 🙂
March 22, 2012 at 6:04 pm
Thanks a lot Evil. I think that works.
March 22, 2012 at 11:02 pm
ColdCoffee (3/22/2012)
Order by NEWID and take the TOP 400 🙂
I have to say that I agree with that. If the rows are in the random order provided by NEWID(), there's no benefit to picking one out of every group of 5 in a further attempt at randomizing the selection. Well... unless it's an interview question. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
March 22, 2012 at 11:13 pm
Jeff Moden (3/22/2012)
ColdCoffee (3/22/2012)
Order by NEWID and take the TOP 400 🙂I have to say that I agree with that. If the rows are in the random order provided by NEWID(), there's no benefit to picking one out of every group of 5 in a further attempt at randomizing the selection. Well... unless it's an interview question. 😉
I couldnt have said it better, Jeff. 100% agree. No point in ordering the result set, NTILEing, and then randomizing the result; Instead, just order by newID() and the pick the top 400, which, IMHO, gives more or less logically the same result..
March 22, 2012 at 11:54 pm
Or you could do this from Books Online:
J. Using TABLESAMPLE to read data from a sample of rows in a table
The following example uses TABLESAMPLE in the FROM clause to return approximately 10 percent of all the rows in the Customer table in the AdventureWorks2008R2 database.
USE AdventureWorks2008R2 ;
GO
SELECT *
FROM Sales.Customer TABLESAMPLE SYSTEM (10 PERCENT) ;
March 23, 2012 at 9:14 am
This didn't work. The difference of RowID between each records shouldn't exceed more than 8. In other words, a record from each group should output sequentially.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply