Query help

  • I have a table A. It has 2000 rows. First distort the order of the table so there is no specifiec order in the table. Then I need to pick 1 row out of every 5 rows in a random order. So, we need to pick a row from 1-5, a row from 6-10 randomly and so on...

    Any idea?

  • Any idea of you giving us some readily consumable data to work on the query?

  • Use a cte to apply a NEWID() to each row, and use a Row_Number() OVER (ORDER BY NULL) to apply a rowNumber to each row.

    Now, apply a groupNumber by taking the rownumber%5 to break each five rows into their own group, and take a top 1 of each group (cross apply + tally can work for this) ordered by the NewID.

    If you can provide sample schema/data, I can walk you through the code.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Not sure if this is exactly what you want, but go to the FROM clause for T-SQL and check out the <tablesample_clause> .

  • Order by NEWID and take the TOP 400 🙂

  • Thanks a lot Evil. I think that works.

  • ColdCoffee (3/22/2012)


    Order by NEWID and take the TOP 400 🙂

    I have to say that I agree with that. If the rows are in the random order provided by NEWID(), there's no benefit to picking one out of every group of 5 in a further attempt at randomizing the selection. Well... unless it's an interview question. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/22/2012)


    ColdCoffee (3/22/2012)


    Order by NEWID and take the TOP 400 🙂

    I have to say that I agree with that. If the rows are in the random order provided by NEWID(), there's no benefit to picking one out of every group of 5 in a further attempt at randomizing the selection. Well... unless it's an interview question. 😉

    I couldnt have said it better, Jeff. 100% agree. No point in ordering the result set, NTILEing, and then randomizing the result; Instead, just order by newID() and the pick the top 400, which, IMHO, gives more or less logically the same result..

  • Or you could do this from Books Online:

    J. Using TABLESAMPLE to read data from a sample of rows in a table

    The following example uses TABLESAMPLE in the FROM clause to return approximately 10 percent of all the rows in the Customer table in the AdventureWorks2008R2 database.

    USE AdventureWorks2008R2 ;

    GO

    SELECT *

    FROM Sales.Customer TABLESAMPLE SYSTEM (10 PERCENT) ;

  • This didn't work. The difference of RowID between each records shouldn't exceed more than 8. In other words, a record from each group should output sequentially.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply