RE: Design – SQLServerCentral

SSChampion

Points: 12388

September 24, 2010 at 8:34 am

Hugo Kornelis (9/24/2010)
tommyh (9/23/2010)
Have to agree with UMG Developer. The "Multiple identifier columns" shouldnt be part of the answer for 2 reasons.
1. Its not part of MS recommendations (atleast not on the page thats linked in the answer)
2. In the answer you have yourself added
While having multiple identifier columns is not, strictly speaking, against the rules, it makes the table design overly complex, wastes space and adds no value
So your contradicting yourself. Now i do agree that that might not be the best solution but that doesnt make it wrong.
The more I think about this, the more I disagree with the notion that multiple identifier columns is a design flaw.
For example, cars in a table might be idenitified by the VIN. That number is relevant for communications with the supplier and with the official agencies, but employees can't ever remember it, so the license plate number is added as a second identifier column. However, if the table is used in replication, or if it uses FILESTREAM columns, SQL Server requires that a surrogate key with the UNIQUEIDENTIFIER data type is also present in the table. And finally, if there are many references to the cars table, neither the VIN, nor the license number or the UID are a good and space-efficient method for the reference, so it might make sense to add another identifier column, with a numeric data type and the IDENTITY property. (This has the added advantage that inserts won't cause table splits - probaby not relevant in a cars table, but in casese where thousands of rows are added each second, this can be the difference between acceptable performance and utter slowness).
That are, let me count, no less than four identifier columns. Of course all backed by a PRIMARY KEY or UNIQUE constraint. And all relevant. I see no design flaw whatsoever, and I also don't see how this "makes the table design overly complex, wastes space and adds no value".

I agree with you, but I disagree with you (as usual 😛 ). The first two you mention I would not consider identifier columns for the database. They may be business logic identifiers, but I would consider using them as identifiers for database programming to be a serious lapse in judgment -- because they're user-entry and user-editable*, you're simply asking for trouble. Oh, and what about cars that don't have plates yet -- or you don't have the plate info? Might need a null for that column; there goes your UNIQUE constraint.

So that gives us 2 identifier columns, remarkably similar to the two in this QotD -- and that's where you may have a point. But I'd caution here as well. First, we need to know why that surrogate UID is there.

If it's for replication, then you want to be sure that 1) you are only replicating from one source, and the app owner isn't ever going to want to replicate any of the other tables linked via your IDENTITY column, because then you have the chance for collisions. As much as I know you loathe using UIDs as foreign keys, it's often necessary when replication is involved.

If it's for FILESTREAM data, consider whether the FILESTREAM data always has to be accessed with the rest of the data. If not, I'd say it makes more sense to separate the FILESTREAM data to a separate table with the UID as its identifier and a foreign key to the now only identifier in the cars table, the IDENTITY column.

* (What you don't think plate numbers change? As for VINs, are you going to tell Doris in data entry that she has to delete the entire record and start again just because she missed a 9? Good luck with that, buddy!)