way to generate some random unique sequences

  • Hi Experts,

    I need a way to generate some random unique sequences for one of the column (+5 increment for each row)

    sequences as below..

    A00-A05

    A00-A10

    A00-A15

    …………..

    …………..

    …………..

    A00-A95

    A00-B00

    A00-B05

    …………..

    …………..

    …………..

    A00-Z95

    A01-A00

    A01-A05

    …………..

    …………..

    …………..

    …………..

    Z99-Z95

    NOTE : Need to exclude alphabet letters like I and O those easy to confuse letters with numbers should not be used in series sequences

    Please give me some suggestions or script to implement this.

    we are planning to use above series as primary key instead of identity column.

    Thanks in advance.

  • Jampandu (11/26/2014)


    Hi Experts,

    I need a way to generate some random unique sequences for one of the column (+5 increment for each row)

    sequences as below..

    A00-A05

    A00-A10

    A00-A15

    …………..

    …………..

    …………..

    A00-A95

    A00-B00

    A00-B05

    …………..

    …………..

    …………..

    A00-Z95

    A01-A00

    A01-A05

    …………..

    …………..

    …………..

    …………..

    Z99-Z95

    NOTE : Need to exclude alphabet letters like I and O those easy to confuse letters with numbers should not be used in series sequences

    Please give me some suggestions or script to implement this.

    we are planning to use above series as primary key instead of identity column.

    Thanks in advance.

    Quick suggestion, this is straight forward using the CHAR() function in combination with the ROW_NUMBER and Modulo (%) operator. My question is why doing this?

    😎

  • I'm sorry, but I have to ask it – why would you want to do something like that? I don't see any advantage of using a value that is generated like that over using identity or sequence. I can only see disadvantages of managing it. In both cases the value has no real meaning to the data and it is only there to uniquely identify the row, so why use something complicate instead of something very simple?

    Adi

    --------------------------------------------------------------
    To know how to ask questions and increase the chances of getting asnwers:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • I would keep an identity with seed and increment of 5. Based on that identity column, you could calculate the needed sequence. Note that with this formula, you can only have 2,447,593 unique values.

    Example:

    CREATE TABLE SequenceTest(

    intid int IDENTITY( 5, 5),

    calcid AS CASE WHEN ((intid / 480000) % 24) >= 13 THEN CHAR( 67 + (intid / 480000) % 24)

    WHEN ((intid / 480000) % 24) >= 8 THEN CHAR( 66 + (intid / 480000) % 24)

    ELSE CHAR( 65 + (intid / 480000) % 24) END +

    RIGHT( 100 + ((intid / 100) / 24), 2) + '-' +

    CASE WHEN ((intid / 100) % 24) >= 13 THEN CHAR( 67 + (intid / 100) % 24)

    WHEN ((intid / 100) % 24) >= 8 THEN CHAR( 66 + (intid / 100) % 24)

    ELSE CHAR( 65 + (intid / 100) % 24) END +

    RIGHT( 100 + intid, 2),

    dummy varchar(10));

    INSERT INTO SequenceTest

    SELECT TOP 10000000 'x'

    FROM sys.all_columns, sys.all_columns b;

    SELECT *

    FROM SequenceTest;

    GO

    DROP TABLE SequenceTest

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • By having the structure you're asking for, along with a sequence, doesn't that make it no longer random?

    I've done something similar on a smaller scale to get every 3 character alphanumeric permutation using cross joins. That was pretty quick, but as you want it as a primary key rather than some sort of lookup table you wouldn't want to be dragging around some huge table with every combination

  • Jampandu (11/26/2014)


    we are planning to use above series as primary key instead of identity column.

    I didn't read this part. Using this as a primary key won't give you any benefits. You'll need 7 bytes instead of 4 used by the int and you'll reduce the number of values available. If you use it as an alternate key just to have a "more readable" key, you can keep it as a calculated column that won't use any space and you don't have to use it as foreign key in any other tables.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thanks Luis , sorry for the confusion.

    As you all suggested we can go with identity columns as a primary key and computed column for unique sequence numbers.

  • Luis Cazares (11/26/2014)


    I would keep an identity with seed and increment of 5. Based on that identity column, you could calculate the needed sequence. Note that with this formula, you can only have 2,447,593 unique values.

    [/code]

    Hi Luis,

    by considering your point (2,447,593 unique values) we decided to add 2 letter index before hand

    like this aa-a00-a05 .... aa-z99-z95 ...ab-a00-a05 ....zz-z99-z95

    That would give us around 1,654,572,868 possible values.More than we would ever possibly use.

    Can you please give a query for this.

    It will take some time for me to understand your query 🙂 .

    Thanks,

    Jampandu

  • You have posted thin on the SQL2008 board, but there is another solution if you are on SQL2012. There is a new option of SEQUENCE where you can define the identity column to get its number from a SEQUENCE key rather than the next number in the identity incrementer. this means that you can set every record in every table to have a unique ID number but it is functionally better than GUIDs because they are issued sequentially.

    I can see two areas where they would be useful. one is if you have a database that is shared by different clients as it forces complete separation of the datasets without having to put clientID everywhere. The second would be for a SARBOX type audit where you would be able to prove that every ID number existed in the database (and therefore no records have been deleted) and that related records were written in the correct order because of the way their ID numbers were allocated.

  • we are using sql 2014. we can implement the solution using sequences

  • Glad to be of help.

    If you go with this option, understand that it is NOT ANSI standard SQL so you will be commmitted to the MSSQL solution or need some heavy lifting to migrate to another platform.

    also it is not standard practice within the MSSQL user base so you will need to make very sure that everyone knows NOT to set Identity(1,1)

  • Jampandu,

    Adding 2 letters means that I'll have to do additional testing and I have some urgent work to do. Try understanding the code and the solution should become evident.

    Aaron,

    I've never understood the advantage of sequences. I feel that it's only included to be "compatible" with Oracle.

    The first option that you mention seems to me as bad db design. If you're going to have different tables for each client, then you should have separate databases for them. Any code would need to be dynamic or repeated. Imaging having 100 clients using the same db. That only seems like a nightmare.

    The second option has the same disadvantages as identity columns. A rollback will create gaps in the sequence and you might not notice in time to correct it.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Jampandu (11/28/2014)


    we are using sql 2014. we can implement the solution using sequences

    How? Sequences do not allow apha-numeric data. They can only be of these types:

    tinyint - Range 0 to 255

    smallint - Range -32,768 to 32,767

    int - Range -2,147,483,648 to 2,147,483,647

    bigint - Range -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807

    decimal and numeric with a scale of 0

    Gerald Britton, Pluralsight courses

  • g.britton (11/28/2014)


    Jampandu (11/28/2014)


    we are using sql 2014. we can implement the solution using sequences

    How? Sequences do not allow apha-numeric data. They can only be of these types:

    tinyint - Range 0 to 255

    smallint - Range -32,768 to 32,767

    int - Range -2,147,483,648 to 2,147,483,647

    bigint - Range -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807

    decimal and numeric with a scale of 0

    I thought "aaron.reese"is going to provide a solution using SEQUENCES.

  • The sequence asked for is a combination sequence of alpha (bijective) hexavigesimal(base 26) and numerical decimal (base 10). The pattern proposed has eight significant digits with the maximum value of zz-z99-z99 = (26)(26)(26)(10)(10)(26)(10)(10) = 4.56976 x 10^9 = 4569760000.

    😎

    Constructing the sequence is relatively straight forward, first consider this table:

    Max Value a a a 0 0 a 0 0 Significance

    | (26)(26)(26)(10)(10)(26)(10)(10)

    | | | | | | | | |

    | | | | | | | | '--> 1

    | | | | | | | '------> 10

    | | | | | | '----------> 100

    | | | | | '--------------> 2600

    | | | | '------------------> 26000

    | | | '----------------------> 260000

    | | '--------------------------> 6760000

    | '------------------------------> 175760000

    '----------------------------------------> 4569760000

    Using this matrix, we can now set the iteration for each digit in the sequence with a combination of division and modulo operation, this code samples use an inline Tally CTE to construct exactly one fully populated set, the first sample is for SQL Server 2012 and later, the latter sample is for SQL Server 2005 - 2008. Note that the code is left very verbose to make it self explanatory:

    SQL Server 2012 - 1014

    USE tempdb;

    GO

    SET NOCOUNT ON;

    DECLARE @SAMPLE_SIZE BIGINT = 4569760000;

    ;WITH T(N) AS (SELECT N FROM ( VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))

    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N

    FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9,T T10,T T11)

    SELECT

    NM.N

    ,NM.N % 4569760000

    ,CONCAT(

    CHAR(97 + FLOOR( (NM.N % 4569760000) / 175760000))

    , CHAR(97 + FLOOR( ((NM.N % 4569760000) % 175760000) / 6760000))

    , CHAR(45)

    , CHAR(97 + FLOOR( (((NM.N % 4569760000) % 175760000) % 6760000) / 260000))

    , FLOOR( ((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) / 26000)

    , FLOOR( (((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) % 26000) / 2600)

    , CHAR(45)

    , CHAR(97 + FLOOR( ((((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) % 26000) % 2600) / 100))

    , FLOOR((((((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) % 26000) % 2600) % 100) / 10)

    , FLOOR((((((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) % 26000) % 2600) % 100) % 10)

    )

    FROM NUMS NM;

    SQL Server 2005 - 2008

    USE tempdb;

    GO

    SET NOCOUNT ON;

    DECLARE @SAMPLE_SIZE BIGINT = 4569760000;

    ;WITH T(N) AS (SELECT N FROM ( VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))

    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9,T T10,T T11)

    SELECT

    NM.N

    ,NM.N % 4569760000

    , CHAR(97 + FLOOR( (NM.N % 4569760000) / 175760000) )

    + CHAR(97 + FLOOR( ((NM.N % 4569760000) % 175760000) / 6760000) )

    + CHAR(45)

    + CHAR(97 + FLOOR( (((NM.N % 4569760000) % 175760000) % 6760000) / 260000) )

    + CAST( FLOOR( ((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) / 26000) AS CHAR(1))

    + CAST( FLOOR( (((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) % 26000) / 2600) AS CHAR(1))

    + CHAR(45)

    + CHAR(97 + FLOOR( ((((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) % 26000) % 2600) / 100) )

    + CAST( FLOOR((((((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) % 26000) % 2600) % 100) / 10) AS CHAR(1))

    + CAST( FLOOR((((((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) % 26000) % 2600) % 100) % 10) AS CHAR(1))

    FROM NUMS NM;

    The same matrix as used for constructing the sequence can then also be used to translate it back to it's numerical equivalent:

    USE tempdb;

    GO

    SET NOCOUNT ON;

    DECLARE @SEQ_STR CHAR(10) = 'ac-k97-g27';

    SELECT

    CONVERT(BIGINT,ASCII(SUBSTRING(@SEQ_STR,1, 1)) - 97 ,0) * 175760000

    + CONVERT(BIGINT,ASCII(SUBSTRING(@SEQ_STR,2, 1)) - 97 ,0) * 6760000

    + CONVERT(BIGINT,ASCII(SUBSTRING(@SEQ_STR,4, 1)) - 97 ,0) * 260000

    + CONVERT(BIGINT, SUBSTRING(@SEQ_STR,5, 1) ,0) * 26000

    + CONVERT(BIGINT, SUBSTRING(@SEQ_STR,6, 1) ,0) * 2600

    + CONVERT(BIGINT,ASCII(SUBSTRING(@SEQ_STR,8, 1)) - 97 ,0) * 100

    + CONVERT(BIGINT, SUBSTRING(@SEQ_STR,9, 1) ,0) * 10

    + CONVERT(BIGINT, SUBSTRING(@SEQ_STR,10,1) ,0);

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply