• Here is how to generate a sample set of multinomially distributed random numbers. First, you need to create a TYPE and a FUNCTION by running this script:

    CREATE TYPE Distribution AS TABLE (EventID INT, EventProb FLOAT, CumProb FLOAT)

    GO

    CREATE FUNCTION dbo.RN_MULTINOMIAL

    (@Multinomial Distribution READONLY, @URN FLOAT)

    RETURNS INT --Cannot use WITH SCHEMABINDING

    AS

    BEGIN

    RETURN

    ISNULL(

    ( SELECT TOP 1 EventID

    FROM @Multinomial

    WHERE @URN < CumProb

    ORDER BY CumProb)

    -- Handle unlikely case where URN = exactly 1.0

    ,( SELECT MAX(EventID)

    FROM @Multinomial))

    END

    Next, you need to set up your multinomial probability distribution table as follows:

    DECLARE @MultinomialProbabilities Distribution

    ;WITH Tally (n) AS (

    SELECT TOP 28 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns)

    INSERT INTO @MultinomialProbabilities

    SELECT n

    ,CASE n WHEN 1 THEN .6/26. WHEN 2 THEN .3 WHEN 3 THEN .1 ELSE .6/26. END

    ,CASE n WHEN 1 THEN .6*1./26. WHEN 2 THEN .3+.6*1./26. WHEN 3 THEN .4+.6*1./26. ELSE .4+.6*(n-2)/26. END

    FROM Tally

    SELECT * FROM @MultinomialProbabilities

    Note how the EventProb column shows .3 for event 2 and .1 for event 3. The rest are all the remaining probability (.6) divided by the number of events (26). The last column is the cumulative probability for all previous events (last row should show 1).

    The hard part is now behind us.

    Now, within the same SQL batch as the above, this test harness tests the generated random numbers so you can compare to the distribution's expected frequency.

    DECLARE @TestNums INT = 1000

    ;WITH Tally (n) AS (

    SELECT TOP (@TestNums) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns a CROSS JOIN sys.all_columns b)

    SELECT MNRN, CountOfMNRNs=COUNT(MNRN), ActualProbability=COUNT(MNRN)/(1.*@TestNums)

    FROM (

    SELECT MNRN=dbo.RN_MULTINOMIAL(@MultinomialProbabilities, URN)

    FROM Tally

    CROSS APPLY (SELECT URN=RAND(CHECKSUM(NEWID()))) a

    ) a

    INNER JOIN @MultinomialProbabilities ON EventID=MNRN

    GROUP BY MNRN

    The key to generating a group of random numbers is the part I highlighted in bold/ This generates a sample set based on the value of @TestNums. The rest of it just groups by EventID and calculates the actual probability. This should center around 0.23 for all events except 2 and 3, which should be close to .3 and .1. The more numbers you generate, the closer they should be to the actual distribution.

    Hope this helps.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St