SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


A Look at GUIDs


A Look at GUIDs

Author
Message
B. Hedge
B. Hedge
Grasshopper
Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)Grasshopper (13 reputation)

Group: General Forum Members
Points: 13 Visits: 1
I think that GUIDs are more valuable than most people give them credit for. There is a debate about the performance impact but if used wisely to set up a good live archive system then performance in the primary production DB will improve. I also like to use GUIDs on frequently updated records and rows. Instead of doing record locking I update the GUID after every update... my business layer has the ability to display the differences to the losing update.
jim650313
jim650313
Grasshopper
Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)

Group: General Forum Members
Points: 11 Visits: 36
We use GUID's for an object relational mapping (ORM) layer in our framework. This allows clients to use the ORM for their own use (that we may or may not have knowledge of) to add tables / objects etc. We can confidently add to the ORM with out fear of a collision.
Also GUID's are used (not as a primary key) to provide a unique reference enable collaboration with other software eg CRM, Finance systems. So far it has served us well.
JS



tymberwyld
tymberwyld
SSCommitted
SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)

Group: General Forum Members
Points: 1618 Visits: 275
We've used GUIDs a lot in one implementation. This was a time-tracking system and one of the things we implemented was in the Activities table, the PKey was a concatenation of the User's GUID Key + the Date + a SeqNo (ex. {00171A97-3D6A-436E-8536-A0788420686C}:01032006-0001). Mostly, we used GUIDs at the Parent Entity levels in the DB (Users, Companies, etc) and concatenations at the Child levels.

I've never seen a performance hit from doing this even though the PKey is really a VarChar(70). I understood the risks of page splits, but I think in this case there MIGHT be less page splis because the Rows for each user would be lumped together on the same page(s) (as much as possible). This probably accounts for the reason that there really isn't a performance hit when selecting rows.



Stewart Joslyn
Stewart Joslyn
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1283 Visits: 188
In this scenario, the GUID is not unique in the table so I would expect exactly the same result if you were using an integer or anything else as the user id - except for any saving from having a shorter index key.

I assume that you mainly are concerned with searching by user - if by date then changing the order of the key components might show up in insert performance.



alexweatherall
alexweatherall
Old Hand
Old Hand (333 reputation)Old Hand (333 reputation)Old Hand (333 reputation)Old Hand (333 reputation)Old Hand (333 reputation)Old Hand (333 reputation)Old Hand (333 reputation)Old Hand (333 reputation)

Group: General Forum Members
Points: 333 Visits: 39
You might not be seeing a performance hit if your looking at events user by user - but it may get slower the longer you run this model (depending on activity rates?). Also, if using this generated key structure, store your dates as YYYYMMDD, then they'll be indexed by User and in date order (other wise you won't get all the events from one year together?!?)

Surely you would be better storing this data in a binary column instead of a varchar column. It would take up 16 (GUID) + 4 (date) + 2 (seqno) = 22 bytes instead of the 50ish bytes taken up by the string. This allows for your required generated key (not questioning your choices of your primary key structure here - thats another topic!), just the data size. This would reduce your PK column size by half, improving index performance and size.

If it's a hidden key, then it doesn't matter that it's binary column (ie not human readable), and it will be more efficient (slightly) to "decode" it as well.

So you could generate the column data like this:

DECLARE @UserUID uniqueidentifier
DECLARE @Date datetime -- or char(8)
DECLARE @SeqNo smallint
DECLARE @GeneratedActivityKey binary(22)

SET @UserUID = NEWID()
SET @Date = GETDATE()
SET @SeqNo = 22

SET @GeneratedActivityKey = CONVERT(CONVERT(binary(16),@UserUID)+CONVERT(binary(4),@Date)+CONVERT(binary(2),@SeqNo))

SELECT @GeneratedActivityKey

will give something like

0x00171A973D6A436E8536A0788420686C00012ABC0012

Only taking up 22 bytes and still giving you your clustered pages (by user,date,seqno)



It would probably be more efficient on selects to use a composite key however.



tymberwyld
tymberwyld
SSCommitted
SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)

Group: General Forum Members
Points: 1618 Visits: 275
Yes, I've always had it in mind to re-think the key. I like your idea and I'll look into it. We don't use the keys for anything really (meaning no one tracks them). I had thought to move the date portion to YYYYMMDD but never got around to it. One thing I was also thinking of doing was to move the Clustered Key from this PKey to some other columns, but I could never really think of a good key for this.

Thanks for the input!



alexweatherall
alexweatherall
Old Hand
Old Hand (333 reputation)Old Hand (333 reputation)Old Hand (333 reputation)Old Hand (333 reputation)Old Hand (333 reputation)Old Hand (333 reputation)Old Hand (333 reputation)Old Hand (333 reputation)

Group: General Forum Members
Points: 333 Visits: 39
Another caveat about using GUIDs as keys:

Beware of using non-sequential GUIDs as a non-clustered PRIMARY KEY on large historical data. If you want to delete a block of data (say all events before '2006-01-01'), even if your clustered index is on your eventdate column, it will have to remove nodes from your non-clustered index for all rows affected by the DELETE statement. If your non-clustered index is on a "random" GUID, then this will hit a high percentage of pages on your index (with large data = very slow!!!). This occurs even if you batch your DELETE (with SET ROWCOUNT and a loop). If the GUID was sequential, then the nodes will be together and the DELETE from index operation would be much faster. If you use the COMB technique described elsewhere in this discussion (I think there was a link earlier) or a sequential key (int, bigint, sequentialguid etc), then this problem is somewhat alleviated.

Or drop and recreate the index, but that could be expensive too!



Andy Warren
Andy Warren
One Orange Chip
One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)

Group: Moderators
Points: 25637 Visits: 2746

I also received a question about whether both implementations guaranteed unique. I asked Steve to follow up with MS to get a 'good' answer, and Paul Randal from the storage team was kind enough to provide this answer:

"These two intrinsics are thread-safe, and multiple concurrent users will get different UUIDs. http://blogs.msdn.com/sqlprogrammability/archive/2006/03/23/559061.aspx gives more detail on this topic."



Andy
SQLAndy - My Blog!
Connect with me on LinkedIn
Follow me on Twitter
Michael MacGregor
Michael MacGregor
Grasshopper
Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)

Group: General Forum Members
Points: 21 Visits: 42

The database that I have inherited is infested with GUIDs. I should probably at this point inform you that I am not a fan of GUIDs, never have been and probably never will be especially after having to work with this schema.

The original reasoning behind choosing GUIDs as the clustered PKs was reasonably sound, to provide uniqueness to values that are generated at divers locations. However, due to the nature of the system with a high record insertion, the GUIDs are causing performance issues, locking up resources while page splits occur, blocking out waiting processes and ultimately resulting in timeouts.

Sure we could increase the timeout duration, but, to be honest, the wait time involved is unacceptable in our particular system, and not just from our perspective but most especially from our customers who do not appreciate delays.

You mention about hot-spots caused by using identity keys, well I've never yet had any issues with those, and I have been able to dramatically improve performance by eliminating the GUIDs in favour of identity. This on SQL 2005 as well.

However, the article was definitely thought-provoking and I did in fact learn something new, which is always a good thing, and for those of you who like using GUIDs, all power to you, I'm just not about to be a convert.

Mike


Jamie Thomson
Jamie Thomson
SSCrazy
SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)

Group: General Forum Members
Points: 2723 Visits: 188

Hi Mike,

Great post. Its good to have some real-world experiences on here. It kind of confirms what I was already thinking as well.

Do you have any stats about this "dramatic improvement"?

-Jamie



Jamie Thomson
http://sqlblog.com/blogs/jamie_thomson
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search