select several lines randomly according to several categories

  • abdalah.mehdoini

    SSC Eights!

    Points: 853

    Hello,

    I have the following dataset:

    Family Room subfamily

    1            5               6

    2            5               7

    3             5              7

    4            2               3

    I would like to extract the data from this table randomly. It would take in the extraction a piece taken at random by couple family / sub-family.

    For this dataset for example the extraction would be:

    Family Room subfamily

    1            5          6

    3            5          7

    4           2           3

    I do not know how to attack the request to have the expected result ... If someone has clues I am a taker.

    Thanks in advance.

  • Alan Burstein

    SSC Guru

    Points: 61067

    You can use TOP and ORDER BY NEWID() like this:

    -- Sample data
    DECLARE @yourdata TABLE
    (
    Family INT,
    Room INT,
    subfamily INT
    );
    INSERT @yourdata VALUES (1,5,6),(2,5,7),(3,5,7),(4,2,3),(4,2,4);

    -- Number of Rows
    DECLARE @rows INT = 3;

    SELECT TOP (@rows) y.Family, y.Room, y.subfamily
    FROM @yourdata AS y
    ORDER BY NEWID();

    You can return a random number of rows using CHECKSUM and NEWID like this:

    DECLARE @yourdata TABLE 
    (
    Family INT,
    Room INT,
    subfamily INT
    );
    INSERT @yourdata VALUES (1,5,6),(2,5,7),(3,5,7),(4,2,3),(4,2,4);

    -- number of rows to return (e.g. 2 to 4 rows)
    DECLARE @low INT = 2, @high INT = 4;

    SELECT TOP (ABS(CHECKSUM(NEWID())%(@high-@low+1))+@low) y.Family, y.Room, y.subfamily
    FROM @yourdata AS y
    ORDER BY NEWID();
    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • jcelko212 32090

    SSCrazy Eights

    Points: 8881

    >> I have the following dataset: <<

    And thanks to your lack of netiquette, we have no DDL. When you're at work do you ask other people to do the basic things that an SQL programmer should do because you can't do them yourself? we have no datatypes, no keys no constraints. And even column names are vague.

    >> I would like to extract the data from this table randomly. It would take in the extraction a piece taken at random by couple family / sub-family. I do not know how to attack the request to have the expected result ... If someone has clues I am a taker. <<

    What sampling method do you want to use? Random? Stratified? SQL was not meant for this and the attempts to paste it into SQL have been disappointing. How big a sample do you want to get from your population? To do this right, you're going to have to know some basic statistics and it doesn't sound like you do.

    You might want to look at an old, but still, a very good introductory book entitled "A Sampler on Sampling" by Bill Williams (ISBN 0471030368).

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • Alan Burstein

    SSC Guru

    Points: 61067

    And thanks to your lack of netiquette, we have no DDL.

    I think what I posted should answer the OP's question as well as yours Joe.

     

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • jcelko212 32090

    SSCrazy Eights

    Points: 8881

    > You can return a random number of rows using CHECKSUM and NEWID <<

    Is this method a uniform random thing? I think I remember that a lot of these SQL Server trick skew toward the rows that are at the front of a data page ...

     

     

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • rVadim

    Hall of Fame

    Points: 3938

    Why not use TABLESAMPLE?

    SELECT * FROM YourTable TABLESAMPLE(100 ROWS)
    SELECT * FROM YourTable TABLESAMPLE(10 PERCENT)

    --Vadim R.

  • Sumathi

    SSC Rookie

    Points: 28

    This can be done using

    SELECT TOP (@rows) y.Family, y.Room, y.subfamily

    FROM @Table AS y

    ORDER BY NEWID();

    OR

    SELECT y.Family, y.Room, y.subfamily

    FROM @Table AS y TABLESAMPLE(3 ROWS)

    But, the TABLESAMPLE clause cannot be applied to derived tables, tables from linked servers, tables derived from table-valued functions, rowset functions or OPENXML or from views.

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply