Update values with random values

  • I have a table with about 100000 records and I need to take one of those fields and

    update the values within 1-5% negative or positive (randomly)

    select values from table

    Value          +/-         Random n% between 1 and 5

    14654654     +             n% = new#

    Update values w/ new#

    Repeat with next value

  • Try:

    update mytable

    set myvalue = myvalue + CAST(RAND(CAST(NEWID() AS VARBINARY))*5+1 AS INT)

    James.

    Oops, guess that won't produce any negative numbers.  Sorry.

  • Ok, if you have to have negative numbers you could try this:

    update mytable

    set myvalue = myvalue +

                  case when RAND(CAST(NEWID() AS VARBINARY)) > 0.50

                       then CAST(RAND(CAST(NEWID() AS VARBINARY))*-5+1 AS INT)

                       else CAST(RAND(CAST(NEWID() AS VARBINARY))*5+1 AS INT)

                  end

    James.

     

  • SELECT

    1 + (CHECKSUM(NEWID()) % 5) / 100.0

     


    N 56°04'39.16"
    E 12°55'05.25"

  • Hey Peter, nice and neat, but you don't need the last " / 100.00".  It will give the -5 to +5 range without it.  At least when I tested it on my computer 🙂

    select top 1000  1 + (CHECKSUM(NEWID()) % 5)

       FROM Master.dbo.SysColumns sc1,

            Master.dbo.SysColumns sc2

    That is much nicer than the long winded solution I had. 

    James.

  • Close Peter... and nicely done, but I believe you need this instead...

    SELECT 1 + (CHECKSUM(NEWID()) % 6) / 100.0

    ...yours gave a range of .96 to 1.04... above gives range of .95 to 1.05

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Percent James... he needs the /100 to turn it into the decimal equivelent of percent so it can be used as a multiplier...

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff (sorry Peter).  I read it several times but my brain just wouldn't interpret it that way.  Must have been a senior moment.  I just re-read it and I believe you are right, though the wording and example just don't "click" 100% in the old noggin even now.

    James.

  • Heh... no problem James... short naps are what get me through the day

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Peter, I gotta say it again... nice job... I was a bit concerned about distribution of the numbers so I compared your method against the traditional method... both have the same +/- distribution and your's is about twice as fast as well as being a bit more intuitive... thanks for the tip

    --===== Peter's method of generating random numbers from -5 to +5

     SELECT SUM(CASE WHEN r >0 THEN 1 ELSE 0 END)/5.0 AS GTZero,

            SUM(CASE WHEN r <0 THEN 1 ELSE 0 END)/5.0 AS LTZero,

            SUM(CASE WHEN r =0 THEN 1 ELSE 0 END)/2.0 AS EQZero,

            MIN(r) AS MinVal,

            MAX(r) AS MaxVal

       FROM (

             SELECT TOP 10000

                    (CHECKSUM(NEWID()) % 6 ) r

               FROM Master.dbo.SysColumns sc1,

                    Master.dbo.SysColumns sc2

            )d

    --===== Traditional method of generating random numbers from -5 to +5

     SELECT SUM(CASE WHEN r >0 THEN 1 ELSE 0 END)/5.0 AS GTZero,

            SUM(CASE WHEN r <0 THEN 1 ELSE 0 END)/5.0 AS LTZero,

            SUM(CASE WHEN r =0 THEN 1 ELSE 0 END)/2.0 AS EQZero,

            MIN(r) AS MinVal,

            MAX(r) AS MaxVal

       FROM (

             SELECT TOP 10000

                    CAST(RAND(CAST(NEWID() AS VARBINARY))*12.0-6 AS INT) r

               FROM Master.dbo.SysColumns sc1,

                    Master.dbo.SysColumns sc2

            )d

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Couldn't resist... simple integer rounding gives zero twice the number of hits using both methods... everything else has a pretty good distirubtion... if anyone wants to take a crack at evening the distribution on zero?  And no *^&%$&^#@! Bankers Rounding, please

    --===== Peter's method of generating random numbers from -5 to +5

     SELECT SUM(CASE WHEN r = -5 THEN 1 ELSE 0 END) AS M5,

            SUM(CASE WHEN r = -4 THEN 1 ELSE 0 END) AS M4,

            SUM(CASE WHEN r = -3 THEN 1 ELSE 0 END) AS M3,

            SUM(CASE WHEN r = -2 THEN 1 ELSE 0 END) AS M2,

            SUM(CASE WHEN r = -1 THEN 1 ELSE 0 END) AS M1,

            SUM(CASE WHEN r =  0 THEN 1 ELSE 0 END) AS ZZ,

            SUM(CASE WHEN r = +1 THEN 1 ELSE 0 END) AS P1,

            SUM(CASE WHEN r = +2 THEN 1 ELSE 0 END) AS P2,

            SUM(CASE WHEN r = +3 THEN 1 ELSE 0 END) AS P3,

            SUM(CASE WHEN r = +4 THEN 1 ELSE 0 END) AS P4,

            SUM(CASE WHEN r = +5 THEN 1 ELSE 0 END) AS P5,

            MIN(r) AS MinVal,

            MAX(r) AS MaxVal

       FROM (

             SELECT TOP 10000

                    (CHECKSUM(NEWID()) % 6 ) r

               FROM Master.dbo.SysColumns sc1,

                    Master.dbo.SysColumns sc2

            )d

    --===== Traditional method of generating random numbers from -5 to +5

     SELECT SUM(CASE WHEN r = -5 THEN 1 ELSE 0 END) AS M5,

            SUM(CASE WHEN r = -4 THEN 1 ELSE 0 END) AS M4,

            SUM(CASE WHEN r = -3 THEN 1 ELSE 0 END) AS M3,

            SUM(CASE WHEN r = -2 THEN 1 ELSE 0 END) AS M2,

            SUM(CASE WHEN r = -1 THEN 1 ELSE 0 END) AS M1,

            SUM(CASE WHEN r =  0 THEN 1 ELSE 0 END) AS ZZ,

            SUM(CASE WHEN r = +1 THEN 1 ELSE 0 END) AS P1,

            SUM(CASE WHEN r = +2 THEN 1 ELSE 0 END) AS P2,

            SUM(CASE WHEN r = +3 THEN 1 ELSE 0 END) AS P3,

            SUM(CASE WHEN r = +4 THEN 1 ELSE 0 END) AS P4,

            SUM(CASE WHEN r = +5 THEN 1 ELSE 0 END) AS P5,

            MIN(r) AS MinVal,

            MAX(r) AS MaxVal

       FROM (

             SELECT TOP 10000

                    CAST(RAND(CAST(NEWID() AS VARBINARY))*12.0-6 AS INT) r

               FROM Master.dbo.SysColumns sc1,

                    Master.dbo.SysColumns sc2

            )d

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hacked a fix for the zero distribution problem on the traditional method...

    --===== Traditional method of generating random numbers from -5 to +5

     SELECT SUM(CASE WHEN r = -5 THEN 1 ELSE 0 END) AS M5,

            SUM(CASE WHEN r = -4 THEN 1 ELSE 0 END) AS M4,

            SUM(CASE WHEN r = -3 THEN 1 ELSE 0 END) AS M3,

            SUM(CASE WHEN r = -2 THEN 1 ELSE 0 END) AS M2,

            SUM(CASE WHEN r = -1 THEN 1 ELSE 0 END) AS M1,

            SUM(CASE WHEN r =  0 THEN 1 ELSE 0 END) AS ZZ,

            SUM(CASE WHEN r = +1 THEN 1 ELSE 0 END) AS P1,

            SUM(CASE WHEN r = +2 THEN 1 ELSE 0 END) AS P2,

            SUM(CASE WHEN r = +3 THEN 1 ELSE 0 END) AS P3,

            SUM(CASE WHEN r = +4 THEN 1 ELSE 0 END) AS P4,

            SUM(CASE WHEN r = +5 THEN 1 ELSE 0 END) AS P5,

            MIN(r) AS MinVal,

            MAX(r) AS MaxVal

       FROM (

             SELECT TOP 11000

                    CAST(RAND(CAST(NEWID() AS VARBINARY))*11 AS INT)-5 r

               FROM Master.dbo.SysColumns sc1,

                    Master.dbo.SysColumns sc2

            )d

    Still trying on Peter's method because it's twice as fast as the traditional method... any ideas, folks?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Heh... also hacked one for Peter's method... slowed it down just a bit but it's still faster than the traditional method...

    --===== Peter's method of generating random numbers from -5 to +5 (distribution problem on 0)

     SELECT SUM(CASE WHEN r = -5 THEN 1 ELSE 0 END) AS M5,

            SUM(CASE WHEN r = -4 THEN 1 ELSE 0 END) AS M4,

            SUM(CASE WHEN r = -3 THEN 1 ELSE 0 END) AS M3,

            SUM(CASE WHEN r = -2 THEN 1 ELSE 0 END) AS M2,

            SUM(CASE WHEN r = -1 THEN 1 ELSE 0 END) AS M1,

            SUM(CASE WHEN r =  0 THEN 1 ELSE 0 END) AS ZZ,

            SUM(CASE WHEN r = +1 THEN 1 ELSE 0 END) AS P1,

            SUM(CASE WHEN r = +2 THEN 1 ELSE 0 END) AS P2,

            SUM(CASE WHEN r = +3 THEN 1 ELSE 0 END) AS P3,

            SUM(CASE WHEN r = +4 THEN 1 ELSE 0 END) AS P4,

            SUM(CASE WHEN r = +5 THEN 1 ELSE 0 END) AS P5,

            MIN(r) AS MinVal,

            MAX(r) AS MaxVal

       FROM (

             SELECT TOP 11000

                    ABS(CHECKSUM(NEWID()))%11-5 r

               FROM Master.dbo.SysColumns sc1,

                    Master.dbo.SysColumns sc2

            )d

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If you need a higher number of decimal places in your random number generation, this will give you 12 decimal places in the range of 0.95 to 1.05

     

    The code below compares it to the method posted by Peter for a sample of 1 million rows.

    select
     A_Min = convert(decimal(18,13),min(a)),
     A_Max = convert(decimal(18,13),max(a)),
     A_Avg = convert(decimal(18,13),avg(a)),
     B_Min = min(b),
     B_Max = max(b),
     B_Avg = avg(b)
    from
     (
     select top 100 percent
     -- My method
     A = 1+convert(bigint,convert(varbinary(8),newid()))%2000000001/40000000000. ,
      -- Peters method
      B =  1 + (CHECKSUM(NEWID()) % 6) / 100.0 
     from
      (select top 100 percent number
      -- Number table function available on following link:
      -- http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=47685
      from f_table_number_range(1,1000000) )a
     ) aa
    Results:
    A_Min                A_Max                A_Avg                B_Min                B_Max                B_Avg                                    
    -------------------- -------------------- -------------------- -------------------- -------------------- ---------------------------------------- 
    .9500000903750       1.0499999645000      1.0000211895050      .950000              1.050000             .999955
    (1 row(s) affected)
    
     
  • Michael,

    Why don't you write an article here on SQLServerCentral about you number and date range generator functions?  I've always been really impressed with the speed of the darned things...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply