RAND Unions

  • gautamcs22 (9/29/2014)


    and what is the possibility if the each rand() return different result ?

    Zero, as the first RAND() is seeded.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Something I've never quite understood is why RAND() is like it is - that is, contrary to what perhaps most people would want and expect: creating a (pseudo) random number each time or row it is called, analagous to NEWID().

    If you need a random number to be generated once and then reused for every row there are explicit ways of doing this, say, with variables. But to get a random number generated on each call requires what look like hacks - creating seeds based on time or other table columns.

    I expect the behaviour is a hangover from older language convention; Help says it mimics C. And it can't be changed; but would it be worth asking for RAND2(), with a sort of 'does what it says on the tin' promise?

  • Must be Monday. Read it as how many columns instead of how many rows. Fortunately, I stopped for some coffee on my way in to work and was able to force the eyelids open enough to re-read how many rows.

    These 2 points are compliments of my local Kwik Trip store... have a happy National Coffee Day everyone!

  • andrew.ing (9/29/2014)


    Something I've never quite understood is why RAND() is like it is - that is, contrary to what perhaps most people would want and expect: creating a (pseudo) random number each time or row it is called, analagous to NEWID().

    If you need a random number to be generated once and then reused for every row there are explicit ways of doing this, say, with variables. But to get a random number generated on each call requires what look like hacks - creating seeds based on time or other table columns.

    I expect the behaviour is a hangover from older language convention; Help says it mimics C. And it can't be changed; but would it be worth asking for RAND2(), with a sort of 'does what it says on the tin' promise?

    Surely there are some developers at microsoft that could correct it or at least add an alternative as you say. It must be too low on the "fix" list.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • TomThomson (9/28/2014)


    Good question.

    Just like the last question on Random, I think the explanation could be improved.

    First, the misleading "for a connection" is there again. It shouldn't be. What RAND returns for a given seed is fixed independent of teh connection.

    The second is that the explanation doesn't explan why anything but the first coulum is the same in each of the three components of the union. The reason is that a RAND call with an explicit non-null seed determines the sequence of values generated by that RAND and all subsequent RAND without an explicit seed parameter in the same connection before the next RAND call with an explicit non-null seed parameter.

    RAND calls with an explicit NULL seed parameter aren't relevant to todays question, but in case anyone wonders what there effect is it's quite simple: deliver NULL as result but they don't affect the sequence for calls with no explicit parameter. For example the only difference in the values delivered by the two call sequences

    RAND(100), RAND(), RAND() and

    RAND(100), RAND(), RAND(NULL), RAND()

    is that the second sequence has a NULL in third position, in between the 2nd and 3rd values produced by the first sequence.

    Thanks for the additional explanation. I knew that RAND(100) would always return the same value. I missed that non-seeded calls to RAND would be affected by the prior seeded call.

  • A different rand() value per row is possible with a little bit of convoluted work:

    CREATE VIEW rnd AS SELECT RAND() AS nbr;

    CREATE FUNCTION GetRnd() RETURNS real AS

    BEGIN

    DECLARE @r real

    SET @r = (select nbr from rnd);

    RETURN @r;

    END

    GO

    SELECT name, dbo.GetRnd() AS n

    FROM sys.tables t

    GO

  • I just executed this against the AdventureWorks2012 database hosted by RedGate on Azure. I got one row. Changing the UNION to UNION ALL, I get:

    (No column name)(No column name)(No column name)

    0.7154366573674850.284633807679820.0131039082850364

    0.7154366573674850.284633807679820.0131039082850364

    0.7154366573674850.284633807679820.0131039082850364

    As you can see, the three rows are identical, so you get one row with a plain UNION

  • i totally read the question wrong, for some reason in my head i was thinking how many columns since that was the first thing i was thinking about, but the question was asking about how many rows, i got a super easy answer wrong :(, good question in any case

  • stormsentinelcammy (9/29/2014)


    i totally read the question wrong, for some reason in my head i was thinking how many columns since that was the first thing i was thinking about, but the question was asking about how many rows, i got a super easy answer wrong :(, good question in any case

    My apologies. A few people got caught here. I shouldn't have 3 columns and 3 rows as it can confuse. I'll change the question to 2 columns and 4 rows. Hopefully that makes it more a focus on RAND and UNION.

  • Steve, thanks for the two points on a (relatively) easy question. It's a good way to start off the work week. 🙂

  • SQLRNNR (9/29/2014)


    andrew.ing (9/29/2014)


    Something I've never quite understood is why RAND() is like it is - that is, contrary to what perhaps most people would want and expect: creating a (pseudo) random number each time or row it is called, analagous to NEWID().

    If you need a random number to be generated once and then reused for every row there are explicit ways of doing this, say, with variables. But to get a random number generated on each call requires what look like hacks - creating seeds based on time or other table columns.

    I expect the behaviour is a hangover from older language convention; Help says it mimics C. And it can't be changed; but would it be worth asking for RAND2(), with a sort of 'does what it says on the tin' promise?

    Surely there are some developers at microsoft that could correct it or at least add an alternative as you say. It must be too low on the "fix" list.

    Asking for a function to return a different value in each row seems a little strange. Do you really want to have something like INSERT Pig (InsertTime, Principal, PigID, PigValue) SELECT GETDATE(), Principal, PigID, Value from PendingPig to call GETDATE() once for each of the potentionally 1000s of rows inserted? If not, why should RAND() be any different?

    Adding an alternative function that behaves like NEWID() would make some things easier, perhaps, but how often is it needed? Far more useful (although it wouldn;t address that issue at all) would be a source of genuine random numbers to use when a pseudo-random sequence isn't good enough, but I guess that one has to wait for hardware to become standard.

    Tom

  • stephen.long.1 (9/29/2014)


    Steve, thanks for the two points on a (relatively) easy question. It's a good way to start off the work week. 🙂

    +1

  • +1

  • I got caught up with that keyword versus with an ALL just last week so it was fresh on my mind.

  • TomThomson (9/29/2014)


    SQLRNNR (9/29/2014)


    andrew.ing (9/29/2014)


    Something I've never quite understood is why RAND() is like it is - that is, contrary to what perhaps most people would want and expect: creating a (pseudo) random number each time or row it is called, analagous to NEWID().

    If you need a random number to be generated once and then reused for every row there are explicit ways of doing this, say, with variables. But to get a random number generated on each call requires what look like hacks - creating seeds based on time or other table columns.

    I expect the behaviour is a hangover from older language convention; Help says it mimics C. And it can't be changed; but would it be worth asking for RAND2(), with a sort of 'does what it says on the tin' promise?

    Surely there are some developers at microsoft that could correct it or at least add an alternative as you say. It must be too low on the "fix" list.

    Asking for a function to return a different value in each row seems a little strange. Do you really want to have something like INSERT Pig (InsertTime, Principal, PigID, PigValue) SELECT GETDATE(), Principal, PigID, Value from PendingPig to call GETDATE() once for each of the potentionally 1000s of rows inserted? If not, why should RAND() be any different?

    Adding an alternative function that behaves like NEWID() would make some things easier, perhaps, but how often is it needed? Far more useful (although it wouldn;t address that issue at all) would be a source of genuine random numbers to use when a pseudo-random sequence isn't good enough, but I guess that one has to wait for hardware to become standard.

    That's interesting. I hadn't considered whether GETDATE() was evaluated per row or once for the whole statement. I think I assumed it was evaluated per row - like NEWID() and unlike RAND() (although I might have felt that RAND() was evaluated but returned a static value).

    To turn your example round, I wouldn't feel comfortable writing

    SELECT PigId, PigName, RAND() FROM ...

    even though my intention was, "Generate one random number then use that for every row". The very name of the function is apt to mislead. And I don't know whether I'd trust it to keep the same value - there might be another rule I wasn't aware of. For clarity I'd probably rephrase it using a variable; it would make my intention clearer.

    It's not a big deal of course, but the confusion it causes (hence this QOTD) seems as if it could have been easily averted at the (long past) design stage.

    One use I had was to assign a random sort order value to a set of rows. I had to go through the usual hoops to make RAND() random.

Viewing 15 posts - 16 through 30 (of 30 total)

You must be logged in to reply to this topic. Login to reply