Random Sample

  • Comments posted to this topic are about the item Random Sample

  • Thanks, a very good question.

    Igor Micev,My blog: www.igormicev.com

  • It can be even used for random of N numbers of a table

    like for e.g in this case:

    SELECT TOP 2 *

    FROM #Students

    ORDER BY NEWID()

    Igor Micev,My blog: www.igormicev.com

  • Nice question, thanks.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Nice one

    Thanks

  • I'm sure I'm missing something very basic here but I was under the impression that because there is no ordering automatically on a table, the row returned by TOP n is not ever guaranteed to be the same. Admittedly you could run a query 20000 times and get the same result but I didn't think run 20001 would necessarily produce the same value. Can somebody enlighten me please?


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

  • Well.. I think you are partially right,

    The fact there is no ORDER BY does not guarantee the records will be returned in order,

    though that is not good enough to get random samples.

    Most of the times the without ORDER BY select will return the same result sets on multiple invocation.

    That is a nice and easy way to get samples, thank you to the author.

    Iulian

  • Iulian -207023 (7/7/2014)


    I'm sure I'm missing something very basic here but I was under the impression that because there is no ordering automatically on a table, the row returned by TOP n is not ever guaranteed to be the same. Admittedly you could run a query 20000 times and get the same result but I didn't think run 20001 would necessarily produce the same value. Can somebody enlighten me please?

    Well I think you are partially right,

    The fact there is no order by does not quarantee the records will be returned in order.

    But that is not good enough to get random samples.

    I fully appreciate that it's not good enough for random sampling. Equally, sometimes the right answer can hinge on a very subtle nuance of SQL Server and, depending on interpretation, any one of the answers could be correct. In this case it's much more clearcut and I'm not disagreeing or nitpicking, I was just making sure that I hadn't misunderstood something fundamental.


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

  • BWFC (7/7/2014)


    I'm sure I'm missing something very basic here but I was under the impression that because there is no ordering automatically on a table, the row returned by TOP n is not ever guaranteed to be the same. Admittedly you could run a query 20000 times and get the same result but I didn't think run 20001 would necessarily produce the same value. Can somebody enlighten me please?

    You are right. Option A, with no Order By, does (apparently) return row 1 every time, at least in SQL2014. But there is no guarantee, as the sequence is not defined. You might get a different result if you ran it 20001 times. It's quite possible (though unlikely) that SQL2016 will always return row 5. Or row 2. Or a random row.

  • initially I was under the impression that RAND(CAST(GETDATE() AS INT)) will generate a random order key.

    But probably it will be random only if ran in different days, though if adjusting the random seed to be another one on each run, i.e. using seconds in the seed, that might be good enough to get random samples.

  • Nice question Christian. Thanks for contributing.

    [font="Verdana"]Please don't go. The drones need you. They look up to you.[/font]
    Connect to me on LinkedIn

  • Nice straight forward question!

  • This was removed by the editor as SPAM

  • XKCD 221[/url] suddenly pops to mind...

  • dhober (7/7/2014)


    XKCD 221[/url] suddenly pops to mind...

    :hehe: Thanks for the question. I had to use this method not too long ago, so it was still fresh on my mind.



    Everything is awesome!

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply