10 Top 1 rows chosen at random

  • Whether or not Peter's specific solution disproves my general statement remains to be seen. How many RDBMSs will it work on?

    As for putting the work in the query. How shall I put this...?

    I've only been coming here for about three months now, but I've noticed that one of the favorite games is a derivative of the old TV show, "Name That Tune":

    Player A: I can execute that query in 0.00000317698 Sec.
    Player B: I can execute that query in 0.00000317534 Sec. I win!

    I'm guessing now, but I think it's going to be fairly difficult to squeeze much more performance out of "SELECT * FROM table". So if we can offload some of the work to an event that happens about a gazillion times less often than a query, why not? How often are we able to squeeze that much performance from a query? Otherwise, why do we play that silly game in the first place.

    Tomm Carr
    --
    Version Normal Form -- http://groups.google.com/group/vrdbms

  • Since this is an SQL Server forum, what does it matter if it will work on any other RDBMS?  You believe in the myth of portable code?

    quote

    I don't think OP was concerned at this point about performance.

    You're absolutely correct... in fact, if you'll go back and read Peter's post a bit more carefully, Peter said that "Running Carr's suggestion reveals that the movies are always the same for every genre".  It's not a matter of performance... it's a matter of you providing an answer that simply didn't work (although I've not tested it myself) any better than hardcoding a result.  You jumped the gun on thinking it was a performance issue just because he posted an execution plan.

    So far as the performance "contests" that some of us engage in, what better way to exchange ideas in the form of code?  At least that has a purpose compared to some other games folks play... How stupid is it for folks to bat a ball against the wall if only to see the opponenent can return it?  How stupid is it for folks to spend a quarter million dollars on a car that runs for 5 seconds or less on the quarter mile and burns 30 gallons of very expensive fuel in the process only to win or lose by milliseconds?  How stupid is it for two grown men to stand in a ring and beat the hell out of each other?

    So far as the OP not being concerned about performance goes, that's probably true... as you said, probably just wanted to know if it could be done.  But we're concerned about performance... how many posts have you seen on these forums asking "How do I tune this query for better performance"?

    So c'mon down off the ceiling and play our silly games with us, Tomm.  It's really a lot of fun if you let it be fun... granted, some folks try to put a little too much "fu" in the word "fun" , but even on those nasty bouts, there always something to learn even if it's how not to do something.  And ya don't even have to get into the ring if you don't want to

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • How are you able to edit the quotes area? I'm not able to reach it when I quote a post.

    ------------------

    I am a believer in taking full advantage of the strengths of the platform chosen for use. However, I'm sure I'm not the only one who comes here who has to work with more than one RDBMS. So I do look for portable solutions when I can find them. Moreover, in this particular instance I had made a general statement and it was challenged based on one specific result. It was just poor argument structure.

    As for my solution not fully satisfying the requirements, I pointed this out myself in my original post. Why do you keep bringing things to my attention that I first brought to yours?

    And I do play the performance game. Search out some of my posts. Performance is important and sometimes critically so. So I was somewhat taken aback at your response that seemed so nonchalant about query performance. I know you have contributed many good solutions here so you know its importance. So I used the word "silly" as a bit of sarcasm at your response.

    Whether my solution satisfies this particular problem is neither here nor there. I get paid the same either way. Going beyond the requirements of this particular problem, there are times when we have an absolute resource hog of a query that just can't be tuned any better than it is. When we hit this kind of brick wall, sometimes (and it doesn't happen nearly often enough) we can improve the query performance by attacking the problem from a completely different angle. Sometimes we can offload some of this resource use to an event (insert and/or update, for example) that occurs far less often than the query and (that it also introduced a bit of randomness which was needed by this particular problem was why I decided to post it, imperfect as it was). Thus we enhance query performance because we've already done some preparatory work.

    Now, Peter's solution wasn't particularly a resource hog, and performance wasn't even on the table at that point, but this problem fit the pattern where a query solution could be fouind outside the query. It wasn't a perfect fit (which I pointed out from the beginning) but I hope it was a good illustration of a different way of approaching a query problem.

    Btw, I'm still not too sure about using NewID() as a random number generator. It is a unique random number generator (meaning that it generates unique values, not that, as a random number generator, it is one-of-a-kind ) It is the uniqueness that concerns me. This points out some underlying processing that could, under the right conditions, be problematic. My questions concerning this have not been addressed. I am not being critical. I have already added Peter's solution to my own bag of tricks -- I just want to know all the possible ramifications before I use it in production code.

    Btw, I do have a solution that doesn't use NewId() and generates a random result set every time the query is executed. But it is not as simple and elegant as Peter's and would have a bit of a performance hit so I haven't posted it. (However, it does have the one benefit of being much more portable, for those of us who have to pay some attention to that.) Also, the listing changes are not really random. It just changes with a pattern that is hidden from the user. I don't look forward to the criticism that it doesn't fit the requirement: "It said random, by gawd, so it's gotta be ipso facto random!" I'll wait to see if there turns out to be a problem with NewId(). I don't really expect there to be any, but everyone knows about assumptions.

    Btw(2), I've been in the software business for close to 25 years. I also post to philosophy forums where, believe me, the debates can really get hot and heavy -- and downright nasty. So, to those of you who occasionally let a little irony slip past them unseen, I really couldn't care less if anything I post is criticized.

    ...as long as it is done properly.

    Tomm Carr
    --
    Version Normal Form -- http://groups.google.com/group/vrdbms

  • Jeff,

    You picked the wrong guy to use sports analogies on. How stupid is it it to hit balls against a fence while others try to catch them or for two guys to stand in a ring beating the hell out of each other? My answer: Pretty Damn Stupid. The only thing more stupid is sitting transfixed in front of the TV for hours at a time watching them do it!

    The way I see it, the fact that some of these idiots make millions of dollars a year is proof that Satan is alive and well and in firm control of the universe -- and he rules not with evil, but with a warped sense of humor.

    Now we can get a decent flame war going...

    Tomm Carr
    --
    Version Normal Form -- http://groups.google.com/group/vrdbms

  • Sorry Tomm... was just trying to lighten things up a bit.  And, maybe I'm taking you the wrong way, but "flame war" about what?  It takes two and I'm out of here

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Well, gee, so was I...

    Tomm Carr
    --
    Version Normal Form -- http://groups.google.com/group/vrdbms

  • We have to wait for OP to return with results.

    And about NEWID() as random generator? We have had several tests (incluing me, Jeff and Michael) to see if ABS(CHECKSUM(NEWID())) generated evenly distributed random result. And it seemed to do this.

     


    N 56°04'39.16"
    E 12°55'05.25"

  • I wasn't doubting the ability of NewID() to generate random results. Nor was I concerned with using it as you did in generating a random sequence of genre IDs for your test table.

    My question concerns it use in the query to do the sorting. I had, at first glance, thought that NewID() was called once per genre in your select statement for a total of 10 times. In looking at it again, I realize it is called for every row! My concern is with calling NewID() so many time, so often (potentially many Ks per query times who-knows-how-many queries per second/minute/day/whatever). When the developers wrote NewID(), they were under the impression that it would be executed predominately in conjunction with INSERT statements -- which occur relatively infrequently.

    Unfortunately, there is no one here that knows the internals of SS well enough to tell how NewID() tracks values it generates to ensure it never generates the same value again (as statistically improbable as that may be, given the size of a GUID). For all I know, it relies on that statistical improbability and performs no checking at all. After all, if it can generate values 100 times per second for 5 thousand years before it approaches a 50/50 chance of generating the same value, how much effort do you need to make to guarantee it doesn't? (I made up those figures, I have no idea what the actual values would be.)

    If that is the case, no problem. It is, I thought, a simple question but it seems to have generated a lot of ink (photons).

    In the meantime, I think OP has completely lost interest.

    Addendum: In the third paragraph above, when I said "no one here..." I meant here where I work, not here this forum. Actually, I'm hoping there is someone here that knows definitively.

    Tomm Carr
    --
    Version Normal Form -- http://groups.google.com/group/vrdbms

  • NEWID() is guaranteed to be unique within the database.

    This gives two hints. Date and Time is involved and so is the Database name.

    Others are (by rumour) MAC address, bios information and video card.

     

     


    N 56°04'39.16"
    E 12°55'05.25"

Viewing 9 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply