way to generate some random unique sequences

  • Jampandu

    Ten Centuries

    Points: 1012

    Thanks you very much Eiriksson..Excellent solution with good matrix explanation..

    But Need to exclude alphabet letters like I and O those easy to confuse letters with numbers from the series

  • Eirikur Eiriksson

    SSC Guru

    Points: 182523

    Jampandu (11/29/2014)


    Thanks you very much Eiriksson..Excellent solution with good matrix explanation..

    But Need to exclude alphabet letters like I and O those easy to confuse letters with numbers from the series

    Simply add an exclusion of a zero based alphabetical index of the unwanted characters:

    😎

    USE tempdb;

    GO

    SET NOCOUNT ON;

    DECLARE @SAMPLE_SIZE BIGINT = 4569760000;

    ;WITH T(N) AS (SELECT N FROM ( VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))

    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N

    FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9,T T10,T T11)

    SELECT

    NM.N

    ,NM.N % 4569760000

    ,CONCAT(

    CHAR(97 + FLOOR( (NM.N % 4569760000) / 175760000))

    , CHAR(97 + FLOOR( ((NM.N % 4569760000) % 175760000) / 6760000))

    , CHAR(45)

    , CHAR(97 + FLOOR( (((NM.N % 4569760000) % 175760000) % 6760000) / 260000))

    , FLOOR( ((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) / 26000)

    , FLOOR( (((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) % 26000) / 2600)

    , CHAR(45)

    , CHAR(97 + FLOOR( ((((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) % 26000) % 2600) / 100))

    , FLOOR((((((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) % 26000) % 2600) % 100) / 10)

    , FLOOR((((((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) % 26000) % 2600) % 100) % 10)

    )

    FROM NUMS NM

    WHERE FLOOR( (NM.N % 4569760000) / 175760000) NOT IN (8,14)

    AND FLOOR( ((NM.N % 4569760000) % 175760000) / 6760000) NOT IN (8,14)

    AND FLOOR( (((NM.N % 4569760000) % 175760000) % 6760000) / 260000) NOT IN (8,14)

    AND FLOOR( ((((((NM.N % 4569760000) % 175760000) % 6760000) % 260000) % 26000) % 2600) / 100) NOT IN (8,14)

  • Jampandu

    Ten Centuries

    Points: 1012

    Thanks Eiriksson .. I need to add this logic in computed column along with bigint identity(5,5). will try to exclude alphabet letters like I and O while computing ..

  • Eirikur Eiriksson

    SSC Guru

    Points: 182523

    Jampandu (11/29/2014)


    Thanks Eiriksson .. I need to add this logic in computed column along with bigint identity(5,5). will try to exclude alphabet letters like I and O while computing ..

    Then in your formula, add a "jump" (see the matrix for the jump increments) to the incoming numerical sequence instead of the filtering.

    😎

  • Eirikur Eiriksson

    SSC Guru

    Points: 182523

    Quick thought, since the alpha and the numerics have fixed places in the sequence construct/pattern, why not include all 26 alphas instead of degrading the capacity of the sequence to quadrivigesimal/base 24?

    😎

  • aaron.reese

    SSChampion

    Points: 13415

    @Erickur,

    Excellent explanation of how, but some of us were questioning why?

    Why would you create a unique key that had no relevance to the business. There are only reason that makes sense to me.

    The first is to provide some sort of check digit validation that the number entered is valid and the OP has not requested that as a feature.

    The second is that you want to enforce uniqueness across a number of different tables and still provide a single field FK. Previous to 2012 the only options were identity fields which are not unique across tables or GUIDs which are not sequential. SEQUENCE satisfies both the requirement of database uniqueness and incrementability.

    There is a good blog post here

  • aaron.reese

    SSChampion

    Points: 13415

    Luis Cazares (11/28/2014)


    Jampandu,

    I've never understood the advantage of sequences. I feel that it's only included to be "compatible" with Oracle.

    The first option that you mention seems to me as bad db design. If you're going to have different tables for each client, then you should have separate databases for them. Any code would need to be dynamic or repeated. Imaging having 100 clients using the same db. That only seems like a nightmare.

    The second option has the same disadvantages as identity columns. A rollback will create gaps in the sequence and you might not notice in time to correct it.

    Sequences allow you to ensure that each record in the database has a different number so there is no possibilty of joining the wrong tables by using the wrong FK reference. If you know the SEQ number, you can scan all of the tables in the db and know that you have found the right record.

    I never suggested separate tables for each client; what I meant was that you if you had a 3NF relationship between client -< cusotmer -<order, then you can have one orders table for all the orders and know that a client can only get at his customers and orders because of the way the SEQ number enforces complete relational segregation. You can do it with IDs but if you incorrectly join two tables, you can jump clients; this is just physically impossible if every record in the entire db has a different ID number.

  • Jampandu

    Ten Centuries

    Points: 1012

    Eirikur Eiriksson (11/29/2014)


    Jampandu (11/29/2014)


    Thanks Eiriksson .. I need to add this logic in computed column along with bigint identity(5,5). will try to exclude alphabet letters like I and O while computing ..

    Then in your formula, add a "jump" (see the matrix for the jump increments) to the incoming numerical sequence instead of the filtering.

    😎

    Hi Eriksson,

    As per business requirement this sequence should not contain I and o those easy to confuse letters with numbers for the users.

    I tried very hard to exclude these letters for computed column.but I could not make the right formula.

    It would be good if you give me a query/formula to exclude I and o while computing.

    Thanks

  • Jampandu

    Ten Centuries

    Points: 1012

    Luis/Erickur,

    Instead of computing every time, I am planing to insert all the unique sequences into one look up table.

    CREATE TABLE Lkup_UniqueSequenceValues

    (

    Id bigint identity(1,1) primary key

    ,SequenceNumber bigint

    ,UniqueSequence varchar(15)

    )

    Sample values

    Id SequenceNumber UniqueSequence

    1 1 aa-a00-a01

    2 2 aa-a00-a02

    3 3 aa-a00-a03

    4 4 aa-a00-a04

    5 5 aa-a00-a05

    6 6 aa-a00-a06

    I have created scalar valued function to return UniqueSequence for given Id from Lkup_UniqueSequenceValues tables.

    then I will use this scalar valued function for transaction table's computed column against to primary key identity column.

    Please suggest if there is some other ways which performs better than above mentioned approach.

  • Eirikur Eiriksson

    SSC Guru

    Points: 182523

    Jampandu (11/29/2014)


    Luis/Erickur,

    Instead of computing every time, I am planing to insert all the unique sequences into one look up table.

    CREATE TABLE Lkup_UniqueSequenceValues

    (

    Id bigint identity(1,1) primary key

    ,SequenceNumber bigint

    ,UniqueSequence varchar(15)

    )

    Sample values

    Id SequenceNumber UniqueSequence

    1 1 aa-a00-a01

    2 2 aa-a00-a02

    3 3 aa-a00-a03

    4 4 aa-a00-a04

    5 5 aa-a00-a05

    6 6 aa-a00-a06

    I have created scalar valued function to return UniqueSequence for given Id from Lkup_UniqueSequenceValues tables.

    then I will use this scalar valued function for transaction table's computed column against to primary key identity column.

    Please suggest if there is some other ways which performs better than above mentioned approach.

    It somewhat depends on the usage, mind you that the lookup table approach can place quite a strain on the IO system.

    😎

    Although the mathematical solution is relatively straight forward, T-SQL is not the best language for mathematical expressions. Here is a quick solution based on a combination of a set based and an arithmetic approach.

    First a code that generates a fully populated set:

    USE tempdb;

    GO

    SET NOCOUNT ON;

    /*

    Quadrivigesimal(Base 24) / Decimal Matrix

    Max Value a a a 0 0 a 0 0 Significance

    | (24)(24)(24)(10)(10)(24)(10)(10)

    | | | | | | | | |

    | | | | | | | | '--> 1

    | | | | | | | '------> 10

    | | | | | | '----------> 100

    | | | | | '--------------> 2400

    | | | | '------------------> 24000

    | | | '----------------------> 240000

    | | '--------------------------> 5760000

    | '------------------------------> 138240000

    '----------------------------------------> 3317760000

    */

    DECLARE @SAMPLE_SIZE BIGINT = 3317760000;

    ;WITH T(N) AS (SELECT N FROM ( VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))

    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N

    FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9,T T10,T T11)

    ,BASE_TWENTYFOUR(IX,CHR) AS (SELECT * FROM (VALUES

    ( 0,'A'),( 1,'B'),( 2,'C'),( 3,'D'),( 4,'E'),( 5,'F'),( 6,'G'),( 7,'H')

    ,( 8,'J'),( 9,'K'),(10,'L'),(11,'M'),(12,'N'),(13,'P'),(14,'Q'),(15,'R')

    ,(16,'S'),(17,'T'),(18,'U'),(19,'V'),(20,'W'),(21,'X'),(22,'Y'),(23,'Z')

    ) AS X(IX,CHR)

    )

    ,SEQUENCE_MATRIX AS

    (

    SELECT

    NM.N

    ,FLOOR((NM.N / 138240000)) AS POS_1

    ,FLOOR((NM.N % 138240000) / 5760000) AS POS_2

    ,FLOOR((NM.N % 5760000) / 240000) AS POS_3

    ,FLOOR((NM.N % 240000) / 24000) AS POS_4

    ,FLOOR((NM.N % 24000) / 2400) AS POS_5

    ,FLOOR((NM.N % 2400) / 100) AS POS_6

    ,FLOOR((NM.N % 100) / 10) AS POS_7

    ,FLOOR((NM.N % 100) % 10) AS POS_8

    FROM NUMS NM

    )

    SELECT

    SM.N

    ,CONCAT

    (

    B1.CHR

    ,B2.CHR

    ,CHAR(45)

    ,B3.CHR

    ,SM.POS_4

    ,SM.POS_5

    ,CHAR(45)

    ,B6.CHR

    ,SM.POS_7

    ,SM.POS_8

    ) AS SEQ_VALUE

    FROM SEQUENCE_MATRIX SM

    INNER JOIN BASE_TWENTYFOUR B1 ON SM.POS_1 = B1.IX

    INNER JOIN BASE_TWENTYFOUR B2 ON SM.POS_2 = B2.IX

    INNER JOIN BASE_TWENTYFOUR B3 ON SM.POS_3 = B3.IX

    INNER JOIN BASE_TWENTYFOUR B6 ON SM.POS_6 = B6.IX;

    A scalar function using the same logic

    USE tempdb;

    GO

    SET NOCOUNT ON;

    IF OBJECT_ID('dbo.SVFN_GEN_B24_SEQUENCE_VALUE') IS NOT NULL DROP FUNCTION dbo.SVFN_GEN_B24_SEQUENCE_VALUE;

    GO

    CREATE FUNCTION dbo.SVFN_GEN_B24_SEQUENCE_VALUE

    (

    @DECIMAL_NUM BIGINT

    )

    RETURNS CHAR(10)

    AS

    BEGIN

    RETURN (SELECT SEQ_STR FROM (

    SELECT

    CONCAT

    (

    B1.CHR

    ,B2.CHR

    ,CHAR(45)

    ,B3.CHR

    ,SM.POS_4

    ,SM.POS_5

    ,CHAR(45)

    ,B6.CHR

    ,SM.POS_7

    ,SM.POS_8

    ) AS SEQ_STR

    FROM

    (

    SELECT

    FLOOR((@DECIMAL_NUM / 138240000)) AS POS_1

    ,FLOOR((@DECIMAL_NUM % 138240000) / 5760000) AS POS_2

    ,FLOOR((@DECIMAL_NUM % 5760000) / 240000) AS POS_3

    ,FLOOR((@DECIMAL_NUM % 240000) / 24000) AS POS_4

    ,FLOOR((@DECIMAL_NUM % 24000) / 2400) AS POS_5

    ,FLOOR((@DECIMAL_NUM % 2400) / 100) AS POS_6

    ,FLOOR((@DECIMAL_NUM % 100) / 10) AS POS_7

    ,FLOOR((@DECIMAL_NUM % 100) % 10) AS POS_8

    ) AS SM

    INNER JOIN

    (SELECT * FROM (VALUES

    ( 0,'A'),( 1,'B'),( 2,'C'),( 3,'D'),( 4,'E'),( 5,'F'),( 6,'G'),( 7,'H')

    ,( 8,'J'),( 9,'K'),(10,'L'),(11,'M'),(12,'N'),(13,'P'),(14,'Q'),(15,'R')

    ,(16,'S'),(17,'T'),(18,'U'),(19,'V'),(20,'W'),(21,'X'),(22,'Y'),(23,'Z')

    ) AS X(IX,CHR)

    ) B1 ON SM.POS_1 = B1.IX

    INNER JOIN

    (SELECT * FROM (VALUES

    ( 0,'A'),( 1,'B'),( 2,'C'),( 3,'D'),( 4,'E'),( 5,'F'),( 6,'G'),( 7,'H')

    ,( 8,'J'),( 9,'K'),(10,'L'),(11,'M'),(12,'N'),(13,'P'),(14,'Q'),(15,'R')

    ,(16,'S'),(17,'T'),(18,'U'),(19,'V'),(20,'W'),(21,'X'),(22,'Y'),(23,'Z')

    ) AS X(IX,CHR)

    ) B2 ON SM.POS_2 = B2.IX

    INNER JOIN

    (SELECT * FROM (VALUES

    ( 0,'A'),( 1,'B'),( 2,'C'),( 3,'D'),( 4,'E'),( 5,'F'),( 6,'G'),( 7,'H')

    ,( 8,'J'),( 9,'K'),(10,'L'),(11,'M'),(12,'N'),(13,'P'),(14,'Q'),(15,'R')

    ,(16,'S'),(17,'T'),(18,'U'),(19,'V'),(20,'W'),(21,'X'),(22,'Y'),(23,'Z')

    ) AS X(IX,CHR)

    ) B3 ON SM.POS_3 = B3.IX

    INNER JOIN

    (SELECT * FROM (VALUES

    ( 0,'A'),( 1,'B'),( 2,'C'),( 3,'D'),( 4,'E'),( 5,'F'),( 6,'G'),( 7,'H')

    ,( 8,'J'),( 9,'K'),(10,'L'),(11,'M'),(12,'N'),(13,'P'),(14,'Q'),(15,'R')

    ,(16,'S'),(17,'T'),(18,'U'),(19,'V'),(20,'W'),(21,'X'),(22,'Y'),(23,'Z')

    ) AS X(IX,CHR)

    ) B6 ON SM.POS_6 = B6.IX

    ) AS X

    )

    END

    An inline table value function using the same logic

    USE tempdb;

    GO

    SET NOCOUNT ON;

    IF OBJECT_ID('dbo.ITVFN_GEN_B24_SEQUENCE_VALUE') IS NOT NULL DROP FUNCTION dbo.ITVFN_GEN_B24_SEQUENCE_VALUE;

    GO

    CREATE FUNCTION dbo.ITVFN_GEN_B24_SEQUENCE_VALUE

    (

    @DECIMAL_NUM BIGINT

    )

    RETURNS TABLE

    AS

    RETURN

    WITH BASE_TWENTYFOUR(IX,CHR) AS (SELECT * FROM (VALUES

    ( 0,'A'),( 1,'B'),( 2,'C'),( 3,'D'),( 4,'E'),( 5,'F'),( 6,'G'),( 7,'H')

    ,( 8,'J'),( 9,'K'),(10,'L'),(11,'M'),(12,'N'),(13,'P'),(14,'Q'),(15,'R')

    ,(16,'S'),(17,'T'),(18,'U'),(19,'V'),(20,'W'),(21,'X'),(22,'Y'),(23,'Z')

    ) AS X(IX,CHR)

    )

    ,SEQUENCE_MATRIX AS

    (

    SELECT

    FLOOR((@DECIMAL_NUM / 138240000)) AS POS_1

    ,FLOOR((@DECIMAL_NUM % 138240000) / 5760000) AS POS_2

    ,FLOOR((@DECIMAL_NUM % 5760000) / 240000) AS POS_3

    ,FLOOR((@DECIMAL_NUM % 240000) / 24000) AS POS_4

    ,FLOOR((@DECIMAL_NUM % 24000) / 2400) AS POS_5

    ,FLOOR((@DECIMAL_NUM % 2400) / 100) AS POS_6

    ,FLOOR((@DECIMAL_NUM % 100) / 10) AS POS_7

    ,FLOOR((@DECIMAL_NUM % 100) % 10) AS POS_8

    )

    SELECT

    CONCAT

    (

    B1.CHR

    ,B2.CHR

    ,CHAR(45)

    ,B3.CHR

    ,SM.POS_4

    ,SM.POS_5

    ,CHAR(45)

    ,B6.CHR

    ,SM.POS_7

    ,SM.POS_8

    ) AS SEQ_STR

    FROM SEQUENCE_MATRIX SM

    INNER JOIN BASE_TWENTYFOUR B1 ON SM.POS_1 = B1.IX

    INNER JOIN BASE_TWENTYFOUR B2 ON SM.POS_2 = B2.IX

    INNER JOIN BASE_TWENTYFOUR B3 ON SM.POS_3 = B3.IX

    INNER JOIN BASE_TWENTYFOUR B6 ON SM.POS_6 = B6.IX;

    Sample usage of the scalar function

    USE tempdb;

    GO

    SET NOCOUNT ON;

    DECLARE @SAMPLE_SIZE BIGINT = 1000000;--3317760000;

    ;WITH T(N) AS (SELECT N FROM ( VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))

    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N

    FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9,T T10,T T11)

    SELECT

    NM.N

    ,dbo.SVFN_GEN_B24_SEQUENCE_VALUE(NM.N) AS SEQ_STR

    FROM NUMS NM;

    Sample usage of the table value function

    USE tempdb;

    GO

    SET NOCOUNT ON;

    DECLARE @SAMPLE_SIZE BIGINT = 1000000;--3317760000;

    ;WITH T(N) AS (SELECT N FROM ( VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))

    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N

    FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9,T T10,T T11)

    SELECT

    NM.N

    ,SQ.SEQ_STR

    FROM NUMS NM

    CROSS APPLY dbo.ITVFN_GEN_B24_SEQUENCE_VALUE(NM.N) AS SQ;

  • Jampandu

    Ten Centuries

    Points: 1012

    Thank you very much Eiriksson and Luis for your valuable solutions.

  • aaron.reese

    SSChampion

    Points: 13415

    @Jampandu

    I am still not clear WHY you want a unique number that is not the Identity column but fulfils the same purpose. By putting the values in a lookup table you have simply moved the Identity to a separate table (with the associated cost of joins and storage)

    Who has told you that you need a 'business friendly' record identifier in the defined format because I don't think they fully understand third normal form and the different between identity, primary, foreign and business keys. Identity keys are internal record pointers and should not be displayed to , or used by, the end users; but replacing it with some other meaningless code is creating long term potential issues. IF you wanted to include some sort of validation number in the code (luhn check digit for example) that meant that the code would be invalid if the user transposes two characters then I could understand, but all you are doing is creating extra work the the database administrators, front end application developers and for users who now have to record a meaningless string of characters so appease some random decision made by someone who doesn't understand what they are doing.

    I assume from the questions you have asked that you are not in control of this project and have been TOLD to do it this way, but you really need to push back on the requirement as I cannot see any context in which it makes sense from a data management perspective

Viewing 12 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply