Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««1234»»

A Look at GUIDs Expand / Collapse
Author
Message
Posted Friday, July 28, 2006 2:31 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, February 28, 2007 12:46 PM
Points: 1, Visits: 1
I think that GUIDs are more valuable than most people give them credit for.  There is a debate about the performance impact but if used wisely to set up a good live archive system then performance in the primary production DB will improve.  I also like to use GUIDs on frequently updated records and rows.  Instead of doing record locking I update the GUID after every update...  my business layer has the ability to display the differences to the losing update.   
Post #298191
Posted Sunday, July 30, 2006 11:42 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, November 28, 2012 4:22 PM
Points: 1, Visits: 36
We use GUID's for an object relational mapping (ORM) layer in our framework. This allows clients to use the ORM for their own use (that we may or may not have knowledge of) to add tables / objects etc. We can confidently add to the ORM with out fear of a collision.
Also GUID's are used (not as a primary key) to provide a unique reference enable collaboration with other software eg CRM, Finance systems. So far it has served us well.
JS



Post #298285
Posted Monday, July 31, 2006 7:27 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Saturday, February 2, 2013 8:21 AM
Points: 283, Visits: 268
We've used GUIDs a lot in one implementation. This was a time-tracking system and one of the things we implemented was in the Activities table, the PKey was a concatenation of the User's GUID Key + the Date + a SeqNo (ex. {00171A97-3D6A-436E-8536-A0788420686C}:01032006-0001). Mostly, we used GUIDs at the Parent Entity levels in the DB (Users, Companies, etc) and concatenations at the Child levels.

I've never seen a performance hit from doing this even though the PKey is really a VarChar(70). I understood the risks of page splits, but I think in this case there MIGHT be less page splis because the Rows for each user would be lumped together on the same page(s) (as much as possible). This probably accounts for the reason that there really isn't a performance hit when selecting rows.



Post #298352
Posted Monday, July 31, 2006 7:33 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Monday, May 9, 2011 7:49 AM
Points: 343, Visits: 188
In this scenario, the GUID is not unique in the table so I would expect exactly the same result if you were using an integer or anything else as the user id - except for any saving from having a shorter index key.

I assume that you mainly are concerned with searching by user - if by date then changing the order of the key components might show up in insert performance.



Post #298353
Posted Tuesday, August 1, 2006 4:35 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Thursday, August 12, 2010 8:45 AM
Points: 293, Visits: 39
You might not be seeing a performance hit if your looking at events user by user - but it may get slower the longer you run this model (depending on activity rates?). Also, if using this generated key structure, store your dates as YYYYMMDD, then they'll be indexed by User and in date order (other wise you won't get all the events from one year together?!?)

Surely you would be better storing this data in a binary column instead of a varchar column. It would take up 16 (GUID) + 4 (date) + 2 (seqno) = 22 bytes instead of the 50ish bytes taken up by the string. This allows for your required generated key (not questioning your choices of your primary key structure here - thats another topic!), just the data size. This would reduce your PK column size by half, improving index performance and size.

If it's a hidden key, then it doesn't matter that it's binary column (ie not human readable), and it will be more efficient (slightly) to "decode" it as well.

So you could generate the column data like this:

DECLARE @UserUID uniqueidentifier
DECLARE @Date datetime -- or char(8)
DECLARE @SeqNo smallint
DECLARE @GeneratedActivityKey binary(22)

SET @UserUID = NEWID()
SET @Date = GETDATE()
SET @SeqNo = 22

SET @GeneratedActivityKey = CONVERT(CONVERT(binary(16),@UserUID)+CONVERT(binary(4),@Date)+CONVERT(binary(2),@SeqNo))

SELECT @GeneratedActivityKey

will give something like

0x00171A973D6A436E8536A0788420686C00012ABC0012

Only taking up 22 bytes and still giving you your clustered pages (by user,date,seqno)



It would probably be more efficient on selects to use a composite key however.



Post #298816
Posted Wednesday, August 2, 2006 6:48 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Saturday, February 2, 2013 8:21 AM
Points: 283, Visits: 268
Yes, I've always had it in mind to re-think the key. I like your idea and I'll look into it. We don't use the keys for anything really (meaning no one tracks them). I had thought to move the date portion to YYYYMMDD but never got around to it. One thing I was also thinking of doing was to move the Clustered Key from this PKey to some other columns, but I could never really think of a good key for this.

Thanks for the input!



Post #298922
Posted Wednesday, August 2, 2006 7:46 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Thursday, August 12, 2010 8:45 AM
Points: 293, Visits: 39
Another caveat about using GUIDs as keys:

Beware of using non-sequential GUIDs as a non-clustered PRIMARY KEY on large historical data. If you want to delete a block of data (say all events before '2006-01-01'), even if your clustered index is on your eventdate column, it will have to remove nodes from your non-clustered index for all rows affected by the DELETE statement. If your non-clustered index is on a "random" GUID, then this will hit a high percentage of pages on your index (with large data = very slow!!!). This occurs even if you batch your DELETE (with SET ROWCOUNT and a loop). If the GUID was sequential, then the nodes will be together and the DELETE from index operation would be much faster. If you use the COMB technique described elsewhere in this discussion (I think there was a link earlier) or a sequential key (int, bigint, sequentialguid etc), then this problem is somewhat alleviated.

Or drop and recreate the index, but that could be expensive too!



Post #298937
Posted Wednesday, August 2, 2006 8:33 AM
SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: Moderators
Last Login: Today @ 12:08 PM
Points: 6,790, Visits: 1,904

I also received a question about whether both implementations guaranteed unique. I asked Steve to follow up with MS to get a 'good' answer, and Paul Randal from the storage team was kind enough to provide this answer:

"These two intrinsics are thread-safe, and multiple concurrent users will get different UUIDs. http://blogs.msdn.com/sqlprogrammability/archive/2006/03/23/559061.aspx gives more detail on this topic."



Andy
SQLAndy - My Blog!
Connect with me on LinkedIn
Follow me on Twitter
Post #298969
Posted Friday, July 27, 2007 8:15 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, July 2, 2013 2:32 PM
Points: 7, Visits: 40

The database that I have inherited is infested with GUIDs. I should probably at this point inform you that I am not a fan of GUIDs, never have been and probably never will be especially after having to work with this schema.

The original reasoning behind choosing GUIDs as the clustered PKs was reasonably sound, to provide uniqueness to values that are generated at divers locations. However, due to the nature of the system with a high record insertion, the GUIDs are causing performance issues, locking up resources while page splits occur, blocking out waiting processes and ultimately resulting in timeouts.

Sure we could increase the timeout duration, but, to be honest, the wait time involved is unacceptable in our particular system, and not just from our perspective but most especially from our customers who do not appreciate delays.

You mention about hot-spots caused by using identity keys, well I've never yet had any issues with those, and I have been able to dramatically improve performance by eliminating the GUIDs in favour of identity. This on SQL 2005 as well.

However, the article was definitely thought-provoking and I did in fact learn something new, which is always a good thing, and for those of you who like using GUIDs, all power to you, I'm just not about to be a convert.

Mike

Post #385806
Posted Friday, July 27, 2007 8:23 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, November 10, 2013 11:52 AM
Points: 877, Visits: 188

Hi Mike,

Great post. Its good to have some real-world experiences on here. It kind of confirms what I was already thinking as well.

 

Do you have any stats about this "dramatic improvement"?

 

-Jamie

 



Jamie Thomson
http://sqlblog.com/blogs/jamie_thomson
Post #385810
« Prev Topic | Next Topic »

Add to briefcase «««1234»»

Permissions Expand / Collapse