Help with bulk random code insert

  • Hi

    I need to select some records and reinsert/duplicate them again (not replace them) with a set of 8 digit random codes (a-z/A-Z/0-9) (not necessary to be unique.)

    This is what I am trying to do...

    INSERT INTO tblExample(fname,sname,email,code) SELECT fname,sname,email,generateCode() From tblExample WHERE id=12345

    Does anyone know how to do this? I've found several examples that generate a single code but I can't seem to put them into this query.

  • Before we go about trying to solve this problem... have you considered how many foul words you will spell out with such random codes?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If that becomes an issue then I can filter later.

  • Here's a quick-and-dirty scalar function to generate a pseudo-random string of UpperCase letters, LowerCase letters and Numbers.

    There's no error checking, and it will give arithmetic overflow errors if you ask for a string longer than 30 characters.

    Replace your call to "GenerateCode()" with "dbo.GenerateCode(rand(),8)" and NEVER tell anyone where you got it! 😉

    drop function dbo.GenerateCode

    go

    create function dbo.GenerateCode(@seed as float, @length tinyint)

    returns varchar(256)

    as

    begin

    declare @code varchar(256) set @code = ''

    declare @l tinyint, @val tinyint

    set @l = @length

    while @l > 0 begin

    set @val = cast((@seed * power(2,@l)) as int) % 62

    set @code = @code +

    case when @val < 10 then cast(@val as char(1))

    when @val < 36 then char(@val+55)

    else char(@val+61)

    end

    set @l = @l - 1

    end

    return @code

    end

    go

    select dbo.GenerateCode(rand(),8)

  • david meagor (6/7/2010)


    If that becomes an issue then I can filter later.

    Heh... better get your filters ready because it will become an issue. It always does.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ok... you can't spell most swear words without vowels and I really don't want someone to get sued because they didn't filter correctly. Here's an 8 character generator that I made for another post. Warp it to suit your needs. As always, the details are in the comments.

    --===== Conditionally drop the permanent random code and winner tables

    -- Uncomment if you want to test for reruns

    IF OBJECT_ID('dbo.jbmRandomCode','U') IS NOT NULL

    DROP TABLE dbo.jbmRandomCode

    ;

    IF OBJECT_ID('dbo.jbmWinnerCode','U') IS NOT NULL

    DROP TABLE dbo.jbmWinnerCode

    ;

    --===== Generate 1.6 Million random codes where no two adjacent characters

    -- are alike. We also remove all vowels to keep from accidently

    -- spelling swear words and we remove things that could be mistaken

    -- for each other such as:

    -- Capital "O" and "0" (zero)

    -- Lower Case "L" and "1" (one)

    -- Lower Case "o" removed just to avoid all confusion

    -- Jeff Moden

    WITH

    cteBytes AS

    ( --=== Generate 2.2 Million random 8 byte numbers.

    -- We need this many so that when we remove numbers that

    -- have identical adjacent characters and any dupes,

    -- we still have enough to gen 1.6 Million random codes.

    -- This is done at the byte level because NEWID is expensive to run.

    -- If we ran 8 individual NEWID()'s for each code, this code would

    -- take a lot longer.

    SELECT TOP 2200000 CAST(NEWID() AS BINARY(8)) AS Bytes

    FROM Master.sys.All_Columns ac1

    CROSS JOIN Master.sys.All_Columns ac2

    )

    ,

    cteCharacters AS

    ( --=== This is just the list of characters that we have to chose from.

    -- Vowels and certain other characters have been removed to avoid confusion.

    -- Note that both strings must be absolutely identical here.

    SELECT LEN('BCDFGHJKLMNPQRSTVWXYZbcdfghjkmnpqrstvwxyz23456789') AS CharLen,

    'BCDFGHJKLMNPQRSTVWXYZbcdfghjkmnpqrstvwxyz23456789' AS Characters

    )

    ,

    cteChars AS

    ( --=== This takes each byte from the random 8 byte binary number and converts each into

    -- a human readable character by using the random byte to select from the allowed

    -- character list. Note that "modulo" (%) plays an important role here.

    SELECT SUBSTRING(c.Characters,CAST(SUBSTRING(b.Bytes,1,1) AS INT)%c.CharLen+1,1) AS Char1,

    SUBSTRING(c.Characters,CAST(SUBSTRING(b.Bytes,2,1) AS INT)%c.CharLen+1,1) AS Char2,

    SUBSTRING(c.Characters,CAST(SUBSTRING(b.Bytes,3,1) AS INT)%c.CharLen+1,1) AS Char3,

    SUBSTRING(c.Characters,CAST(SUBSTRING(b.Bytes,4,1) AS INT)%c.CharLen+1,1) AS Char4,

    SUBSTRING(c.Characters,CAST(SUBSTRING(b.Bytes,5,1) AS INT)%c.CharLen+1,1) AS Char5,

    SUBSTRING(c.Characters,CAST(SUBSTRING(b.Bytes,6,1) AS INT)%c.CharLen+1,1) AS Char6,

    SUBSTRING(c.Characters,CAST(SUBSTRING(b.Bytes,7,1) AS INT)%c.CharLen+1,1) AS Char7,

    SUBSTRING(c.Characters,CAST(SUBSTRING(b.Bytes,8,1) AS INT)%c.CharLen+1,1) AS Char8

    FROM cteBytes b

    CROSS JOIN cteCharacters c

    )

    ,

    cteRandomCode AS

    ( --=== Now we take all the individual letters and slam them together to make a random code.

    -- In the process, we make sure that no two adjacent characters are alike.

    SELECT Char1+Char2+Char3+Char4+Char5+Char6+Char7+Char8 AS RandomCode

    FROM cteChars

    WHERE Char1<>Char2 AND Char2<>Char3 AND Char3<>Char4 AND Char4<>Char5 AND Char5<> Char6 AND Char6<>Char7 AND Char7<>Char8

    )

    ,

    cteNumberDupes AS

    ( --=== We need to remove dupes and this numbers one or more occurances of the same code

    -- with an incrementing number to count occurances of that same code.

    SELECT ROW_NUMBER() OVER (PARTITION BY RandomCode ORDER BY RandomCode) AS Occurance,

    RandomCode

    FROM cteRandomCode

    ) --=== This selects the top 1.6 million first occurances of all the codes we haven't

    -- rejected so far. Since we only select the first occurance of any given code,

    -- this effectively removes dupes and does so much faster than using DISTINCT.

    -- The final result is stored in a new table called dbo.jbmRandomCode.

    SELECT TOP 1600000

    IDENTITY(INT,1,1) AS RowNum,

    RandomCode COLLATE Latin1_General_BIN AS RandomCode --For case sensitivity

    INTO dbo.jbmRandomCode

    FROM cteNumberDupes

    WHERE Occurance = 1

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks for the replies.

    I used the first script to create a function and have been able to run it using select dbo.GenerateCode(rand(),8)

    however the last three characters are always numeric and don't look very random (often end with the same two digits.) Also it keeps coming up with numbers like 52100000 which seems very unlikely.

    More confusingly I can't seem the new function in the management studio now that it's been created (either in the database it was supposed to be created in or the master database.)

  • hmm.. you're right.

    If you replace the relevant line with:

    set @val = cast((@seed * 63 * power(2,@l)) as int) % 62

    it will give a significantly better distribution of the final couple of characters

  • If you don't care about duplicates and not all letters from A to Z to be used, you can use the following:

    INSERT INTO tblExample(fname,sname,email,code)

    SELECT fname,sname,email,

    LEFT(REPLACE(CAST(NEWID() AS VARCHAR(150)), '-',''),8) AS code

    FROM tblExample WHERE id=12345

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Thanks for your help with this. I went with the last solution as it didn't require stored procedures which I was having issues with and seemed to be the fastest.

    Thanks for your help, much appreciated.

  • You are welcome!

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

Viewing 11 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply