Update query will not run?

  • The following query gives me a random date between 1 and 28 days after the arrival date:

    SELECT ArrivalDate,
           DATEADD(day, 1 + RAND(checksum(NEWID())) * LengthOfStay.LengthofStay, ArrivalDate) AS DepartureDate
    FROM Bookings, LengthOfStay

    However, when I use the update query below, it only gives me dates between 1 and 2 days after the arrival date:

    USE Occupancy

    Update B
    Set DepartureDate = DATEADD(day, 1 + RAND(checksum(NEWID()))*1.5 * L.LengthofStay, B.ArrivalDate)
    FROM LengthOfStay L, Bookings B

    Does anyone know why, and if so how do I change it?

    Thanks

    Wayne

  • In reply to wafw1971 (2/20/2013):

    If you run this multiple times:

    SELECT 1 + RAND(checksum(NEWID()))

    You should see that the number returned is always between 1 and 2.

    If you want it to return a number between 1 and 28, use this:

    SELECT 1 + ABS(checksum(NEWID())) % 28
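
    A possible explanation for the UPDATE result, offered as a sketch rather than a confirmed diagnosis: DATEADD truncates a fractional day count, and when UPDATE ... FROM matches one Bookings row to many LengthOfStay rows, only one arbitrary match is applied. If that match has LengthofStay = 1, then 1 + RAND(...) * 1.5 truncates to 1 or 2. Assuming the goal is simply a whole number of days from 1 to 28 for each booking, the join to LengthOfStay can be dropped:

    ```sql
    -- Sketch only: assumes Bookings has ArrivalDate and DepartureDate columns,
    -- as in the queries above.
    -- NEWID() is evaluated per row, so each booking gets its own offset.
    -- CHECKSUM(NEWID()) % 28 lies in -27..27; ABS() folds that to 0..27,
    -- and adding 1 gives 1..28.
    UPDATE B
    SET DepartureDate = DATEADD(day, 1 + ABS(CHECKSUM(NEWID()) % 28), B.ArrivalDate)
    FROM Bookings AS B;
    ```

    Taking the modulo before ABS() is a small design choice: it avoids the rare overflow error that ABS() raises when CHECKSUM returns the smallest INT value.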


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
    Since random numbers are too important to be left to chance, let's generate some!
    Learn to understand recursive CTEs by example.
    [url url=http://www.sqlservercentral.com/articles/St

  • My boss has just told me I have done it completely wrong: the query above hasn't randomised anything in our data. I have 48000 records with a 2-night stay, 48000 records with a 27-night stay, etc.

    What I need is for 30% of the stays to be 2 days long, 10% to be 3 days, and the rest to be randomised among 1 and 4 to 28 days.

    Can anyone help?

  • In reply to wafw1971 (2/20/2013):

    Possibly. What you need is to generate random numbers based on a multinomial distribution. See the second article in my signature links (about random number generators in SQL) for a function that will do this.



  • Hi Dwain

    Thanks for the link, but to be honest it's way above where I am in my training (I'm 4 weeks in); it looks like a different language.

    My boss said I should be able to write a CASE statement along these lines: generate a random number between 0 and 1; if it is between 0 and 0.3 it is a 2-day stay, if it is between 0.3 and 0.4 it is a 3-day stay, and anything from 0.4 to 1 covers all the other stays between 1 and 28 days.

    But I don't know how to write the code.

    Can you help or point me in the right direction?

    Ta

    Wayne

  • OK. Give me a few minutes and I'll code up an example.



  • Hi Dwain

    I have tried to break down what I need below; I hope this helps.

    Thanks

    Wayne

    Step 1

    Arrival Date (Already generated) – 1.35 Million Times

    Step 2

    Randomise a number between 0 and 1

    Step 3

    Use the Randomised number produced above to create the script below

    UPDATE BOOKINGS

    SET DepartureDate

    CASE WHEN RAND() Result = Between 0 and 0.3 = Departure Date will be 2 Nights Later

    CASE WHEN RAND() Result = Between 0.3 and 0.4 = Departure Date will be 3 Nights Later

    CASE WHEN RAND ()Result >0.4 = Departure Date will be either 1,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28 Nights Later

  • Here is how to generate a sample set of multinomially distributed random numbers. First, you need to create a TYPE and a FUNCTION by running this script:

    CREATE TYPE Distribution AS TABLE (EventID INT, EventProb FLOAT, CumProb FLOAT)
    GO

    CREATE FUNCTION dbo.RN_MULTINOMIAL
        (@Multinomial Distribution READONLY, @URN FLOAT)
    RETURNS INT --Cannot use WITH SCHEMABINDING
    AS
    BEGIN
        RETURN
            ISNULL(
                (   SELECT TOP 1 EventID
                    FROM @Multinomial
                    WHERE @URN < CumProb
                    ORDER BY CumProb)
                -- Handle unlikely case where URN = exactly 1.0
                ,(  SELECT MAX(EventID)
                    FROM @Multinomial))
    END

    Next, you need to set up your multinomial probability distribution table as follows:

    DECLARE @MultinomialProbabilities Distribution

    ;WITH Tally (n) AS (
        SELECT TOP 28 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
        FROM sys.all_columns)
    INSERT INTO @MultinomialProbabilities
    SELECT n
        ,CASE n WHEN 1 THEN .6/26. WHEN 2 THEN .3 WHEN 3 THEN .1 ELSE .6/26. END
        ,CASE n WHEN 1 THEN .6*1./26. WHEN 2 THEN .3+.6*1./26. WHEN 3 THEN .4+.6*1./26. ELSE .4+.6*(n-2)/26. END
    FROM Tally

    SELECT * FROM @MultinomialProbabilities

    Note how the EventProb column shows .3 for event 2 and .1 for event 3. Each of the other events gets the remaining probability (.6) divided by the number of remaining events (26), which is about .023. The last column is the cumulative probability up to and including each event (the last row should show 1).

    The hard part is now behind us.

    Now, within the same SQL batch as the above, this test harness tests the generated random numbers so you can compare to the distribution's expected frequency.

    DECLARE @TestNums INT = 1000

    ;WITH Tally (n) AS (
        SELECT TOP (@TestNums) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
        FROM sys.all_columns a CROSS JOIN sys.all_columns b)
    SELECT MNRN, CountOfMNRNs=COUNT(MNRN), ActualProbability=COUNT(MNRN)/(1.*@TestNums)
    FROM (
        SELECT MNRN=dbo.RN_MULTINOMIAL(@MultinomialProbabilities, URN)
        FROM Tally
        CROSS APPLY (SELECT URN=RAND(CHECKSUM(NEWID()))) a
        ) a
    INNER JOIN @MultinomialProbabilities ON EventID=MNRN
    GROUP BY MNRN

    The key to generating a group of random numbers is the part I highlighted in bold (the inner query over Tally). This generates a sample set whose size is the value of @TestNums. The rest of it just groups by EventID and calculates the actual probability. This should center around 0.023 (.6/26) for all events except 2 and 3, which should be close to .3 and .1. The more numbers you generate, the closer they should be to the actual distribution.
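
    The same kind of frequency check can be run against the real table once DepartureDate has been updated. A minimal sketch, assuming the Bookings table with the ArrivalDate and DepartureDate columns used earlier in the thread (the output column names are made up for illustration):

    ```sql
    -- Share of bookings per length of stay, to compare with the intended distribution.
    SELECT Nights       = DATEDIFF(day, ArrivalDate, DepartureDate),
           BookingCount = COUNT(*),
           Share        = COUNT(*) / (1. * SUM(COUNT(*)) OVER ())  -- fraction of all rows
    FROM Bookings
    GROUP BY DATEDIFF(day, ArrivalDate, DepartureDate)
    ORDER BY Nights;
    ```

    If the update worked as intended, the rows for 2 and 3 nights should stand out from an otherwise flat spread, rather than every length of stay having the same count.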

    Hope this helps.



  • Perhaps one additional clarification.

    To generate a single multinomial random number, you do it like this:

    SELECT MNRN=dbo.RN_MULTINOMIAL(@MultinomialProbabilities, URN)
    FROM (SELECT URN=RAND(CHECKSUM(NEWID()))) a

    I'm providing this because I wasn't sure whether, when you referred to your skill level, you meant in SQL or in statistics.



  • Thank you for that. I will try to get my head around it tomorrow.

  • I'll help a little further. The script you posted (although syntactically incorrect) does exactly what the RN_MULTINOMIAL function does:

    UPDATE BOOKINGS

    SET DepartureDate

    CASE WHEN RAND() Result = Between 0 and 0.3 = Departure Date will be 2 Nights Later

    CASE WHEN RAND() Result = Between 0.3 and 0.4 = Departure Date will be 3 Nights Later

    CASE WHEN RAND ()Result >0.4 = Departure Date will be either 1,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28 Nights Later

    To put this in a context that might be more familiar to you, what you'll want to do is something like this (after setting up the @MultinomialProbabilities Distribution table):

    UPDATE b
    SET DepartureDate = DATEADD(day,
            dbo.RN_MULTINOMIAL(@MultinomialProbabilities, URN), Arrivaldate)
    FROM BOOKINGS b
    CROSS APPLY (SELECT URN=RAND(CHECKSUM(NEWID()))) a

    Your boss should be happy because you used a CASE statement too (in the setup of the @MultinomialProbabilities Distribution table). 😛



  • I was given a simple way of doing this:

    UPDATE BOOKINGS
    SET DepartureDate =
        DATEADD(day,
            CASE WHEN Rand(CHECKSUM(NEWID())) BETWEEN 0 and 0.3 THEN 2 ELSE
            CASE WHEN Rand(CHECKSUM(NEWID())) BETWEEN 0.3 and 0.5 THEN 3 ELSE
            Round(Rand(CHECKSUM(NEWID())) * 28,0) END END,ArrivalDate)

    Thanks for all your help.

  • In reply to wafw1971 (2/21/2013):

    I suggest you double check your math but good for you.



  • Hi Dwain,

    I didn't mean to offend you; if I did, I am sorry. My boss helped me with that query. Is it not right?

    1.35 million records, of which

    30% should be 2-night stays

    20% should be 3-night stays

    and the other 50% should be randomised among 1 and 4 to 28 days.

    I do have another question in the same vein, if you could help.

    I now need to make 15% of them cancelled by inserting a random Cancelled Date. However, the cancelled date must be >= Booking Date and <= Arrival Date.

    I have completed the random section, but I now need to know how to add the greater-than and less-than part to the query:

    Can you help?

    SELECT ArrivalDate,
        DATEADD(day,
            CASE WHEN Rand(CHECKSUM(NEWID())) BETWEEN 0 and 0.85 THEN NULL ELSE
            CASE WHEN Rand(CHECKSUM(NEWID())) BETWEEN 0.85 and 0.88 THEN 0 ELSE
            CASE WHEN Rand(CHECKSUM(NEWID())) BETWEEN 0.88 and 0.92 THEN -1 ELSE
            CASE WHEN Rand(CHECKSUM(NEWID())) BETWEEN 0.92 and 0.97 THEN -7 ELSE
            Round(Rand(CHECKSUM(NEWID())) * -90,0) END END END END,ArrivalDate) AS DaystoReduce
    FROM Bookings

    Thanks

    Wayne

  • In reply to wafw1971 (2/21/2013):

    No offense taken. I was referring to this line:

    Round(Rand(CHECKSUM(NEWID())) * 28,0) END END,ArrivalDate)

    Which I believe will throw some 0s, 2s and 3s into the mix.
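
    There is a second effect in that query: every Rand(CHECKSUM(NEWID())) call is a separate draw, so the second test is only reached about 70% of the time and its BETWEEN 0.3 and 0.5 range then matches about 20% of those rows (roughly 14% overall). One way around both problems, sketched here under assumptions, is to store one set of draws per booking first and then map them with a single CASE. BookingID is a hypothetical unique key (the thread does not show the table definition), #Draws, U and K are names made up for the example, and the thresholds follow the 30% / 10% / 60% split requested earlier:

    ```sql
    -- Sketch only. BookingID is a hypothetical unique key on Bookings; substitute the real one.
    -- Step 1: store the random draws, so each booking has exactly one U and one K.
    SELECT B.BookingID,
           U = RAND(CHECKSUM(NEWID())),       -- float in 0 to 1: picks the bucket
           K = ABS(CHECKSUM(NEWID()) % 26)    -- integer 0..25: used only for the "other" bucket
    INTO #Draws
    FROM Bookings AS B;

    -- Step 2: map the stored draws to a number of nights.
    UPDATE B
    SET DepartureDate = DATEADD(day,
            CASE WHEN D.U < 0.3 THEN 2        -- about 30%: 2 nights
                 WHEN D.U < 0.4 THEN 3        -- about 10%: 3 nights
                 WHEN D.K = 0   THEN 1        -- remaining 60%: 1 night ...
                 ELSE D.K + 3                 -- ... or 4..28 nights (K = 1..25)
            END,
            B.ArrivalDate)
    FROM Bookings AS B
    JOIN #Draws AS D ON D.BookingID = B.BookingID;

    DROP TABLE #Draws;
    ```

    Storing the draws in a temp table is a deliberately cautious choice: it does not depend on how many times NEWID() ends up being evaluated when a random value is referenced in more than one WHEN branch. For a 30% / 20% / 50% split, the second threshold would become 0.5.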

    I know a way to generate 15% random cancellations as you say you need. However, to do it I need you to provide me with some DDL for the table and some test data in consumable form. It is not something I can just write up without testing and expect to get it right.

    The way I do it might be easier or harder depending on what key fields are available to work with (e.g., if a unique booking number is present).
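
    Until the DDL is available, the general shape of the idea can only be sketched. The example below has not been run against the real table and rests on assumptions: BookingDate is a guessed column name, CancelledDate is just an output alias, and every row is assumed to have BookingDate <= ArrivalDate.

    ```sql
    -- Sketch only. BookingDate is an assumed column name; CancelledDate is an output alias.
    -- Assumes BookingDate <= ArrivalDate on every row (otherwise the modulo divisor can be zero).
    SELECT B.BookingDate,
           B.ArrivalDate,
           CancelledDate =
               CASE WHEN RAND(CHECKSUM(NEWID())) < 0.15   -- roughly 15% of rows get a date
                    THEN DATEADD(day,
                                 -- offset of 0..N days, where N = days from booking to arrival
                                 ABS(CHECKSUM(NEWID()) % (DATEDIFF(day, B.BookingDate, B.ArrivalDate) + 1)),
                                 B.BookingDate)
               END                                        -- all other rows stay NULL (not cancelled)
    FROM Bookings AS B;
    ```

    Because the offset is counted forward from the booking date and capped at the number of days to arrival, the result stays within the required range without separate greater-than and less-than tests. The 15% is an expected share, not an exact count.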


