An alternative to GUIDs

  • Comments posted to this topic are about the item An alternative to GUIDs

  • For independent client generation, we had good success with COMBs: http://www.informit.com/articles/printerfriendly.aspx?p=25862

    The problem you mention with waiting on the database for SCOPE_IDENTITY() can be mitigated now (at least for one master child - not for a complex multi-level hierarchy or network) by using table-valued parameters for master/child data. All the data can be submitted cleanly to stored procs and easy use of SCOPE_IDENTITY() and OUTPUT INTO can avoid the round trips for that. This also mitigates any need for external transaction management around the operation from the client side since it becomes atomic from the client's perspective.

  • This is actually very close to the following by Tatham Oddie: http://blog.tatham.oddie.com.au/2011/07/14/released-snowmaker-a-unique-id-generator-for-azure-or-any-other-cloud-hosting-environment/

    or perhaps even more close to the Twitter Unique ID generator he references: https://github.com/twitter/snowflake

    If anyone was going to implement this type of ID generator I'd suggest taking careful note of the comments in the description of the Twitter ID generator that talk about NTP time servers and what can happen if your clocks are wound back automatically. Not a good thing, and actually more likely to occur in a virtual environment where clocks tend to run a little fast!

  • Some time back we needed to achieve a similar goal, but with sales order numbers, which we previously sequential and the "next" value held in a "housekeeping" table along with invoice numbers etc. The existing performance was poor though - worse than IDs - but we still needed some form of sequentiallity, and it to be human readable - and quotable over the phone by customers - therefore GUIDs were definitely not an option

    We changed the SO numbering here to be a long set of letters and numbers that are part random and part sequence (based on time) which means the computer just “makes it up” rather than grabbing the next one.

    What we did was build a 15 digit order number in the following format, using essentially base 30 - built from 30 easily distinguisable characters (23456789ABCDEFGHJKLMNPRTUVWXYZ)

    S123-TYP-QRT-6RE

    First character “S” is for Sales Order (So we can implement for Q-uotes, R-eturns etc.)

    The bit in bold is the sequence part (changes every minute so)

    The bit not in bold is random – with 729000000 possibilities – not much chance of a clash in the 60 seconds.

    Generally we ask people phoning for the last 3 digits (i.e. customers) and you may get 2 or 3 sales orders appearing, but it’s easy to find the right one… The last 6 digits as you can see from the possibilities are HIGHLY unlikely to get duplicates anyhow.

    Here is our VB.NET code (or most of it). It could be easily changed to provide unique IDs that are human readable, nearly sequential, and don't require a "server ID". All of this is also implementable in SQL if you require using stored procedures.

    Function GetSONumber() As String

    Return GetSequenceNumber("S")

    End Function

    Function GetSequenceNumber(ByVal PrefixChar As String) As String

    PrefixChar = Left(PrefixChar + "X", 1) ' in case blank

    Dim Mins As Integer

    Mins = DateDiff(DateInterval.Minute, CDate("2008-01-01"), GetServerTime)

    Return SplitIntoThrees(PrefixChar + ConvertBase30(Mins) + RandLetts(6))

    End Function

    Function RandLetts(ByVal Chars As Integer) As String

    ' letters that can easily be distinguished visually and audibally

    ' need to build routine to convert S to F for example and block others

    Dim V As String = "23456789ABCDEFGHJKLMNPRTUVWXYZ"

    Dim O As String = ""

    For I As Integer = 1 To Chars

    O = O + Mid(V, RollDice(Len(V)), 1)

    Next

    Return O

    End Function

    Function GetServerTime() As Date

    ' taken from http://www.codeproject.com/KB/vb/NetRemoteTOD.aspx

    ' removed daylight saving part as we want UTC

    ' see http://support.microsoft.com/kb/249716 for full usage info

    ' reverts to local PC time if can't connect

    Try

    Dim iRet As Integer

    Dim ptodi As IntPtr

    Dim todi As TIME_OF_DAY_INFO

    Dim dDate As Date

    Dim strServerName As String = "servername.local" & vbNullChar

    iRet = NetRemoteTOD(strServerName, ptodi)

    If iRet = 0 Then

    todi = CType(Marshal.PtrToStructure(ptodi, GetType(TIME_OF_DAY_INFO)), _

    TIME_OF_DAY_INFO)

    NetApiBufferFree(ptodi)

    dDate = DateSerial(todi.tod_year, todi.tod_month, todi.tod_day) + " " + _

    TimeSerial(todi.tod_hours, todi.tod_mins, todi.tod_secs)

    GetServerTime = dDate

    Else

    GetServerTime = Now.ToUniversalTime

    End If

    Catch

    GetServerTime = Now.ToUniversalTime

    End Try

    End Function

    Function SplitIntoThrees(ByVal CharsIn As String) As String

    Dim i As Integer

    Dim O As String = ""

    For i = 1 To Len(CharsIn)

    If (i - 1) Mod 3 = 0 And i <> 1 Then O = O + "-"

    O = O + Mid(CharsIn, i, 1)

    Next i

    SplitIntoThrees = O

    End Function

    Function RollDice(ByVal NumSides As Integer) As Integer

    ' Create a byte array to hold the random value.

    Dim randomNumber(0) As Byte

    ' Create a new instance of the RNGCryptoServiceProvider.

    Dim Gen As New RNGCryptoServiceProvider()

    ' Fill the array with a random value.

    Gen.GetBytes(randomNumber)

    ' Convert the byte to an integer value to make the modulus operation easier.

    Dim rand As Integer = Convert.ToInt32(randomNumber(0))

    ' Return the random number mod the number

    ' of sides. The possible values are zero-

    ' based, so we add one.

    Return rand Mod NumSides + 1

    End Function

  • nice idea,

    but my concern would be clocks rolling forward/backard.

    As stated above if its a Enterprise app really the whole transaction set should be passed to an SP as one atomic transaction removing the round trip/Transaction handling.

    Sounds like not very good data centre management, if A fails to B and then B wants to pass control back to A what happens to the the orders taken at B. They should be persisted back to A anyhow.

  • Quote: "What happens after 40 years (well actually nearer 60 years)? Well, it will be job security for my great-grandchildren so I'm not going to worry about it."

    I thought we IT guys had learned our lesson after the millenium bug ...

  • Pascal Declercq (9/15/2011)


    Quote: "What happens after 40 years (well actually nearer 60 years)? Well, it will be job security for my great-grandchildren so I'm not going to worry about it."

    I thought we IT guys had learned our lesson after the millenium bug ...

    Does that include the lesson that 'millennium' has two n's? 😛

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Seems to be a lot of re-invent-the-wheel being suggested.

    If you want unique ids (including generating across different machines) and you want random for "security" because you think they are unguessable - then you need ordinary GUIDs. Your random-unguessable requirement is always going to conflict with index performance (if you use this id as a key).

    If you want to avoid the perfromance problems of using GUIDs as keys, and you do your security properly, then use sequential GUIDs which are already built into SQL Server for preciseley this purpose. If you don't want to generate the ids in-database, the same funciton is available on the client, or app layer, in windows api.

    SQL: NEWSEQUENTIALID()

    Client: UuidCreateSequential()

  • Whilst it is true sequential guids solve many of the main issues associated with databases partitioned across multiple servers or failover situations and provide random, hard to guess values, based on experience I would suggest you carefully consider how these would be used.

    Due to time constraints I was forced to use these on a large project and they do work very nicely in code and database however any time a user needs to read back a key things get ugly quickly.

  • Uh, hang on. So what you are saying is different parts of a datetime will represent different parts of the system?

    Why not just create a composite key that does that clearly?

  • One of the mottos that has served me well over the years is "Don't code a solution to an infrastructure problem when an infrastructure solution exists".

    To cope with failover, there are a number of infrastructure solutions including mirroring and log shipping. Trained DBAs are familiar with the processes involved in failover in these scenarios.

    It is not the job of the application/database developer to implement his own failover mechanism, especially in larger shops where standardisation, monitoring, alerting and interchangeability of team members are all issues. Additionally it is the responsibility of DBAs to protect the availability of company data, not that of the developer. Furthermore, the DBAs will be working to a service level that will require one or more of them to be available to deal with failover contingencies - can this be equally said of a developer?

  • I agree with iposner. Database failover redundancies are the DBA's job, not the Developer's.

    There should be no metadata in the data a developer is using to help them figure out which server created the record - if you think you need that type of information, then that should be a tell-tale sign you're already headed down the wrong path.

    I also don't care for some of the examples other are posting of calling another service to get an ID, well what happens when that service goes down (because you have no control over it or those who hit it) - Can we say, fail-whale?

    Then there's the suggestion of creating an ID with letters and numbers in it to represent what type of record it is. You can generate this ID before displaying it to the user. If the user query's for the mnemonic-ID, you can convert it into a query that parses out the mnemonic-ID to query based on record type, numeric ID, and so on.

    It is best to have a sequential numeric ID for performance purposes. It saves on space, and allows for better indexing, which equals better speed.

    There are times when you do need a GUID, but I make it a point to never use them as a Primary Key. Instead, for example, I use them when creating a new login that I need to email to a user for a confirmation link. That GUID is not their User ID, instead it is linked to their User ID. If you don't want GUID's expanding your table row length, then throw them into a separate table (i.e. A table with just UserID and ConfirmationGUID, where UserID is a foreign key to the Primary Key in the User's table).

  • Thanks for a very good thought provoking article.

    Every time I try to think of an alternative to a 16 byte GUID,

    an 8 byte BIGINT (in one form or another) always seems to

    be a reasonable alternative.

    That being said, I believe there is a small typo in the article.

    There are 86400 seconds in a day, not 84400 🙂

    Keep up the good work!

  • I like this solution (and Nick 2535's). it is simple. Doesn't use GUIDs. And has time stamp as high order portion of key.

  • John.E.McGuire (9/15/2011)


    I like this solution (and Nick 2535's). it is simple. Doesn't use GUIDs. And has time stamp as high order portion of key.

    Thanks 🙂

    It's also VERY customisable to give you tighter timing, or less chance of a collision - or not timing if you want. The ability to just get the last 3 digits off a customer for reference - that generally allows you to see their order (as ordered on screen newest first).

    The only place we haven't rolled it out is for invoice numbering as the UK VAT (tax) office don't allow non-sequential numbering.

Viewing 15 posts - 1 through 15 (of 63 total)

You must be logged in to reply to this topic. Login to reply