• Jeff Moden (3/26/2012)


    Michael Valentine Jones (3/26/2012)


    Here is an alternate method that I use to generate the pseudo random numbers. The basic method is to take the right 7 bytes from the NEWID function and convert that to a BIGINT before applying the MODULUS operator. No need for the ABS function, since 7 bytes can only produce a positive BIGINT number.

    if object_id('tempdb..#t','U') is not null begin drop table #t end

    -- Generate 20,000,000 rows

    select top 20000000

    NUMBER = identity(int,1,1)

    into

    #t

    from

    (select top 4473 * from master.dbo.syscolumns) a

    cross join

    (select top 4473 * from master.dbo.syscolumns) b

    -- Show distribution of rowcount around average of 40000

    select

    a.RandomNo,

    Rows = count(*)

    from

    (

    select

    RandomNo =

    (convert(bigint,convert(varbinary(7),newid()))%500)+1

    from

    #t aa

    ) a

    group by

    a.RandomNo

    order by

    count(*),

    a.RandomNo

    RandomNo Rows

    -------------------- -----------

    335 39455

    3 39457

    76 39481

    426 39489

    494 39535

    242 39539

    278 39539

    490 39548

    445 39553

    244 39566

    ...

    ...

    ...

    124 40400

    228 40402

    425 40410

    286 40434

    45 40458

    463 40463

    373 40531

    152 40586

    (500 row(s) affected)

    Like I said in the article, the conversion to VARBINARY will slow things down and to no good end if you don't really need BIGINT for the random integer. If you really want BIGINT capability (and I realize that wasn't one of your goals in your example), I believe you'd also have to convert the whole NEWID() to VARBINARY.

    I also thought you were involved in some testing that showed the use of the square root of the final number of desired rows as a TOP for the self joined table in the Cross Join really wasn't worth it.

    The main point that I'm trying to make is that if it's too complicated, folks won't use it.

    I never was involved in any testing with cross joins using the square root of number of desired rows. I just grabbed a piece of code I had laying to be able to generate a test table, and I wasn't trying to say that was the way you should do it.

    Have you actually tested to confirm that the conversion to VARBINARY and then to BIGINT is slower than CHECKSUM and ABS? I haven't but I would be surprised if there is much difference.

    I was just showing an alternate method to get a random number. I think it is useful in cases where the range of random numbers is greater than an INT.

    Of course, if you want a random full BIGINT, you would do something like this, instead of stopping at 7 bytes; I just stopped at 7 bytes to eliminate the ABS function.

    select convert(bigint,convert(varbinary(8),newid()))