Are There That Many GUIDs?

  • SanDroid wrote:

    How could I have been so stupid to think we were talking about SQL on this web site. My apologies.

    Every IT professional, I think, ought to constantly ask questions about the legitimate business uses, costs and benefits of technology. It's not all about SQL, all the time.

    Apology accepted.

  • Robert Frasca (10/12/2010)


    Craig-315134 (10/12/2010)


    What I fail to see are other business applications where a guid is neccessary for data integrity.

    Perhaps if you want to make a global parcel tracking system? If all companies used a UUID as their tracking number, then they could "easily" upload tracking status to your system.

  • Robert Frasca wrote:

    ... truly distributed computing environments are pretty rare, at least in the context of ithe world wide installed base of SQL Server instances.

    I'm using a distributed computing environment right now to reply to you. It's called the Internet, of course. Recent developments in client-side Internet computing - I'm thinking Ajax, Rest and Soap - make possible, and even preferable, the provisioning of UUIDs without need of a centralised server. So I believe your point is correct, and will likely become even more so as clients take up the UUID provisioning workload.

    I guess I should have been more succinct. I never meant to imply that there is no use for GUID's. Obviously, distributed computing is a legitimate use case.

    Thank you. And yes, it's legitimate. I work with a large distributed system using client-side UUID provisioning. Centralised provisioning would have worked, but was not nearly as clean or efficient a design.

    What I fail to see are other business applications where a guid is neccessary for data integrity. The fact that there is a datatype called uniqueidentifier doesn't make it appropriate or necessary to use it. Microsoft, in BOL, appears to actively discourage it's use for run-of-the-mill data implementations but it remains a popular choice despite it's inherent shortcomings.

    Point well taken, although I would make the same observation as some others here that Microsoft has long used GUIDs in its MS Dynamics/CRM application. Not to say Microsoft never, ever makes mistakes ...

  • As for global uniqueness, I don't see why the column combination (center_id + cashregister_id + transaction_date) or database id + transaction identity id would not be as reliable, or more reliable, than a GUID. A GUID is nothing but a timestamp + (network card id or random number).

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • People often assume you need to use GUIDs for unique values across databases and for replication. However, there is an easy way to get around that using an int or other suitable IDENTITY data type: vary the seed and the increment. If you have two databases, one database can start at 1 with an increment of 2, the other can start at 2 and increment by 2. No collisions. Or have one start with a seed of 1 increment 1 and the other at 1 billion - or some other large number that realistically your database will never hit. Or if you have created the true killer app then have one use negative (1/2 of the available values in signed variable) values, e.g.,

    CREATE TABLE [dbo].[Minus1](

    [IdMinus] [int] IDENTITY(-1,-1) NOT NULL,

    [Test] [nchar](10) NOT NULL,

    CONSTRAINT [PK_Minus1] PRIMARY KEY CLUSTERED

    (

    [IdMinus] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    Different variations on this theme can be used to support more than 2 databases.

  • e3h4 (10/12/2010)


    People often assume you need to use GUIDs for unique values across databases and for replication. However, there is an easy way to get around that using an int or other suitable IDENTITY data type: vary the seed and the increment.

    (Snip perfectly valid explaination)

    There's only one problem with this method, e3h4. The only one of these patterns that allows for new distribution nodes to be brought into the fold is the direct seed increment, and even then you'll need to be on the ball for (rediculously) large systems that might hit the end of their seed allotment.

    GUIDs, on the other hand, have no node # restrictions, can integrate without upper seed bounds, and with less maintenance. They are a convenience, of course, since you explain a perfectly valid workaround. They have their place, however, for ease of use.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • e3h4 (10/12/2010)


    People often assume you need to use GUIDs for unique values across databases and for replication. However, there is an easy way to get around that using an int or other suitable IDENTITY data type: vary the seed and the increment. If you have two databases, one database can start at 1 with an increment of 2, the other can start at 2 and increment by 2. No collisions. Or have one start with a seed of 1 increment 1 and the other at 1 billion - or some other large number that realistically your database will never hit.

    Your solution is good and what is recomended for most replication models using Database publishing in SQL 2005 and SQL 2008. If you add a dbtimestamp column for row versioning it will support all replication models and even allow subscibers and publishers to recover from a backup instead of from having to be re-initialized from a new snap-shot. Yours is one of the few posts today about GUID's and database replication that actually agrees with the supported and recomended methods in the Books Online.

  • All of your comments are great! I would like to add one that I don't believe has been mentioned. Suppose a business has a database application that is sold to several businesses (say 4000). Now suppose that some of those clients decided to merge their businesses or some sort of take over. Now the business has two or more of your databases and requests that you merge them into one database so they only have to paid one licence fee. A resonable request.

    There is the rub and most people forget to anticipate until much too late. Now let's say you have a couple hundred tables in your database and there are a few common Keys and loads of foreign keys. You now have to sync the Identities. A GUID or UNIQUE key would greatly reduce the time and effort in merging these databases.

    This is the business case for GUIDs that I believe make the use appropriate.

    Last year I spent three months on such a case. That work changed the way I look at GUIDs forever. I have greater respect for them now.

  • Hyperdata (10/12/2010)


    All of your comments are great! I would like to add one that I don't believe has been mentioned. Suppose a business has a database application that is sold to several businesses (say 4000). Now suppose that some of those clients decided to merge their businesses or some sort of take over. Now the business has two or more of your databases and requests that you merge them into one database so they only have to paid one licence fee. A resonable request.

    There is the rub and most people forget to anticipate until much too late. Now let's say you have a couple hundred tables in your database and there are a few common Keys and loads of foreign keys. You now have to sync the Identities. A GUID or UNIQUE key would greatly reduce the time and effort in merging these databases.

    This is the business case for GUIDs that I believe make the use appropriate.

    Last year I spent three months on such a case. That work changed the way I look at GUIDs forever. I have greater respect for them now.

    A definite case, but I'm not sure I'd design a database with the idea that my company would merge with another. I don't think it happens often enough to worry about.

  • Even if your company did merge relatively regularly (speaking from experience here...), there would be much more complicated/delicate/time consuming things to deal with than merging database tables.

    Databases take a bit of a behind the scenes role here, which provides plenty of time for merging of tables. More often than not, a new primary key is created along with new dimensions for each heritage.

    So I agree, potential company merges are definitely not a reason to use GUIDs as the Primary Key in your tables. Although they may play a temporary part in the data merge.

  • Hyperdata (10/12/2010)


    All of your comments are great! I would like to add one that I don't believe has been mentioned. Suppose a business has a database application that is sold to several businesses (say 4000). Now suppose that some of those clients decided to merge their businesses or some sort of take over. Now the business has two or more of your databases and requests that you merge them into one database so they only have to paid one licence fee. A resonable request.

    There is the rub and most people forget to anticipate until much too late. Now let's say you have a couple hundred tables in your database and there are a few common Keys and loads of foreign keys. You now have to sync the Identities. A GUID or UNIQUE key would greatly reduce the time and effort in merging these databases.

    This is the business case for GUIDs that I believe make the use appropriate.

    Last year I spent three months on such a case. That work changed the way I look at GUIDs forever. I have greater respect for them now.

    I see your point and I think it's a valid one; however, I've been involved with that kind of integration a dozen or more times. In each case, we simply assigned a range of identities for each source. No fuss, no muss. It's a one time exercise after all and provides the added bonus of being able to tell, at-a-glance, which company the original row came from. Obviously, there's a ripple down effect for foreign keys but it's pretty straight forward.

    Unfortunately, rarely are integration projects quite that cut and dried since many businesses customize their version of that database application, the ETL process is rarely simple.

    If I'm the data architect on the team that developed the original database application sold to these companies I'm not sure I would consider a one-time future integration of multiple systems a legitimate enough use case to offset the inherent overhead of GUIDs.

    You do make an excellent point in that I think a data architect should attempt to anticipate potential integration issues downstream when creating a data model.

    "Beliefs" get in the way of learning.

  • Hyperdata (10/12/2010)


    This is the business case for GUIDs that I believe make the use appropriate.

    Last year I spent three months on such a case. That work changed the way I look at GUIDs forever. I have greater respect for them now.

    I have been involved in at least five Lawsen and/or JDE migrations and merges in the last ten years. Sometimes both at once or one to each other. In every case the main issues were that the system we were migrating to had the same UI friendly Client Identifier i.e. Pfizer or Maersk and that was what cause issues becuase we had to keep seperate what was done for the old companies for those client by creating new UI friendly Client Identifiers for the new business and keep the old business seperate.

    None of this would have been easier by using a UUID or ROWGUID anywhere in either system.

  • e3h4 (10/12/2010)


    People often assume you need to use GUIDs for unique values across databases and for replication. However, there is an easy way to get around that using an int or other suitable IDENTITY data type:

    Um, not going to work for merge replication or transaction replication with immediate updating subscribers, though...

    BOL:

    "A column named rowguid is added to each published table, unless the table already has a column of data type uniqueidentifier with the ROWGUIDCOL property set (in which case this column is used). The rowguid column is used to uniquely identify every row in every published table. If the table is dropped from the publication, the rowguid column is removed; if an existing column was used for tracking, the column is not removed."

    Even if you don't add a GUID to a table in merge replication, SQL will do it for you. So yes, at least for merge replication, it is an absolute requirement to have a GUID.

  • GUIDs can work, but they aren't the answer. Nor are they likely the problem.

    I'm sure we can all find edge cases where they work or don't work, but the point I was trying to make is that I don't recommend them, and I don't think that for the majority of cases they are appropriate. In some cases, they are, but those are a minority of applications.

  • TheSQLGuru (10/12/2010)


    <SmartA$$ ON>I LOVE it when clients use GUIDs in their database schemas - more tuning work for me!! 😀 <SmartA$$ OFF>

    GUID: Generously Useful Income Device

    Just like CASE tools were:

    CASE: Computer-Aided Salary Enhancement

Viewing 15 posts - 46 through 60 (of 169 total)

You must be logged in to reply to this topic. Login to reply