Random values

  • Nice question. Thanks for sharing

  • Good question, Steve, thanks. It seemed too easy for a two-pointer, which made me reexamine the question, thinking that there was a "trick" hidden somewhere, but I concluded that my first instinct was correct (and it was!)

  • TomThomson (9/25/2014)


    Good question, but two small flaws:

    1: the explanation is misleading. It says "The RAND function only produces one value for all calls in a specific connection with a specific seed" which would tend to make people think that a given seed might produce different values in different connectipns. It doesn't: the for a given seed the value produced by RAND is alwys the same, regardless of what connection it is called in.

    2: the values produced are not random, RAND with a seed is deterministic (as clearly stated in BOL Deterministic

    and Nondeterministic Functions) - the value for a given seed is always the same.

    +1

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • edwardwill (9/25/2014)


    Can you explain why you aliased the sys.syscolumns table?

    SELECT TOP 10

    RAND( ROW_NUMBER() OVER (ORDER BY id))

    FROM sys.syscolumns

    is functionally identical, as far as I am aware.

    habit

  • Far be it from me to disagree with a guru of your stature, but personally I find aliasing makes queries difficult to read. I only ever alias where strictly necessary (JOINing the same table >1 time, for example). I guess it helps that I am a touch typist so the keyboard work isn't an issue!

  • Thank you for the question Steve

    😎

  • TomThomson (9/25/2014)


    2: the values produced are not random, RAND with a seed is deterministic (as clearly stated in BOL Deterministic

    and Nondeterministic Functions) - the value for a given seed is always the same.

    Tom,

    A simple question for I agree that the result of the RAND is deterministic and the basis of your statement is correct on face value. The question is this, can a functional loop containing a deterministic function create a random result, or should the result be considered deterministic due to the internal function?

    Thanks,

    M.

    Not all gray hairs are Dinosaurs!

  • There's nothing right or wrong about aliasing. It's preference.

    I try to use them often, because I find many queries quickly get complex and it's much harder for me to read:

    select

    Person.FirstName

    , Person.LastName

    , Person.JobTitle

    , PersonEmail.EmailAddress

    than

    select

    p.FirstName

    , p.LastName

    , p.JobTitle

    , ea.EmailAddress

    No disagreement here. It's preference.

  • TomThomson (9/25/2014)


    Good question, but two small flaws:

    1: the explanation is misleading. It says "The RAND function only produces one value for all calls in a specific connection with a specific seed" which would tend to make people think that a given seed might produce different values in different connectipns. It doesn't: the for a given seed the value produced by RAND is alwys the same, regardless of what connection it is called in.

    2: the values produced are not random, RAND with a seed is deterministic (as clearly stated in BOL Deterministic

    and Nondeterministic Functions) - the value for a given seed is always the same.

    The name of the function is somewhat deceiving

    😎

  • Many people think of this as random. I was trying to point out that there can be issues. I wasn't trying to point out all issues, though I'll change the explanation to say this is deterministic with a seed.

  • Just being a "little" pedantic and of course way off topic, for ad hoc sequence generation, sys.syscolumns is not the best choice, consider these three queries

    😎

    /* Query one, relative cost 43% */

    SELECT TOP 10

    RAND( ROW_NUMBER() OVER (ORDER BY (select null))) AS RND_VAL

    FROM sys.sysobjects AS s

    /* Query two, relative cost 56% */

    SELECT TOP 10

    RAND( ROW_NUMBER() OVER (ORDER BY (select null))) AS RND_VAL

    FROM sys.syscolumns AS s

    /* Query two, relative cost 1% */

    ;WITH T(N) AS (SELECT N FROM ( VALUES (NULL),(NULL),(NULL),

    (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))

    ,NUMS(N) AS (SELECT TOP(10) ROW_NUMBER()

    OVER (ORDER BY (SELECT NULL)) AS

    N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7)

    SELECT

    RAND(NM.N) AS RND_VAL

    FROM NUMS NM;

  • Miles Neale (9/25/2014)


    Tom,

    A simple question for I agree that the result of the RAND is deterministic and the basis of your statement is correct on face value. The question is this, can a functional loop containing a deterministic function create a random result, or should the result be considered deterministic due to the internal function?

    Thanks,

    M.

    Miles, it depends what else is in the loop - if for example it is feeding random parameters to the deterministic function the results may be random; it could get random data from counting delays between keystrokes; in system with multiple things going on and certain types of scheduling it could try calling a delay function (for say a 5 second delay), reading the clock, and taking the number of milliseconds more than 5000 since its last clock read once or twice in each iteration (the first difference in the first iteration is difference from clock value before the loop starts, in each subsequent iteration it is the difference from the last clock read in the previous iteration), it could be reading data from a source of noise through an ADC and so on. If the loop's effect depends on anything that's actually random, the fact that it also uses things that are not random doesn't mean that that effect is not random.

    In the QOTD, there was no source of randomness at all - the seeds were the integers 1 to 10 with no possibility of their being anything else, so that was non-random.

    Tom

  • Except that I would write

    SELECT

    FirstName,

    Surname,

    Address,

    EmailAddress

    FROM

    Person

    As you say, it's all preference. For simple queries such as the above I don't see any value in prepending the table name, unless either the human reading the query needs to know the source table of the column, or it's syntactically required because the column is ambiguous (the same name appearing in more than one table in the query). I've recently had to refactor a very large query with around twenty or so aliased tables, which the author had assigned in alphabetical order (a., b., c. etc.) I could only finally understand what was going on by doing a global replace with the actual table names.

  • Steve Jones - SSC Editor (9/25/2014)


    There's nothing right or wrong about aliasing. It's preference.

    I try to use them often, because I find many queries quickly get complex and it's much harder for me to read:

    select

    Person.FirstName

    , Person.LastName

    , Person.JobTitle

    , PersonEmail.EmailAddress

    than

    select

    p.FirstName

    , p.LastName

    , p.JobTitle

    , ea.EmailAddress

    No disagreement here. It's preference.

    The mixed mode in the QotD seems odd to me, though: you define the alias s so surely you should use s.ID instead of just ID unqualified just to be consistent? Not that it makes the slightest difference, of course. It is, as you say, a matter of preference.

    Tom

  • TomThomson (9/26/2014)


    The mixed mode in the QotD seems odd to me, though: you define the alias s so surely you should use s.ID instead of just ID unqualified just to be consistent? Not that it makes the slightest difference, of course. It is, as you say, a matter of preference.

    My guess is that Steve had SQL Prompt active while typing the question. SQL Prompt has an option to automatically add an alias to the table (with a pretty nifty algorithm for choosing the alias).

    The only thing I dislike about this option is that it ALWAYS adds the alias, whereas my personal coding standard is to always use aliases for all tables in queries that use two or more tables, but use no alias in single-query tables - so for those queries I have to manually remove the alias that SQL Prompt adds.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Viewing 15 posts - 16 through 30 (of 47 total)

You must be logged in to reply to this topic. Login to reply