SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

Give Up on Natural Primary Keys

By Steve Jones,

There is plenty of debate over how to design your database. At SQLServerCentral we have a Stairway Series as well as a few articles that cover design topics. I think it's important for anyone that builds tables to spend some time learning what others have done and understand the pros and cons of making different choices. It does often become hard to change designs once they are in use, so trying to choose a good entity design early is important.

One of the things I think is important in modeling your particular entity is including a primary key (PK). In my DevOps talk I stress this, as I'd rather most attendees come away thinking a PK is important as their first takeaway from the session. There are exceptions, but they are rare, and I would prefer that most tables just have some PK included from the beginning.

A PK ought to be stable as well, and there are plenty of written words about how to pick the PK for your particular problem domain. Often I have received the advice that natural keys are preferred over surrogate keys, and it is worth the effort to try and identify a suitable column (or set of columns) that will guarantee uniqueness. I think that's good advice, and it's also advice I tend to ignore.

There's an interesting article about keys and the GDPR. The first part is a rather basic description of what PKs are, but the second part talks about keys and some of the rights that data subjects have under the GDPR. I think these are worth considering, especially as it's likely similar legislation will make its way into other jurisdictions, as already seen in California. The short part of the argument is that the right to be forgotten or to have your data deleted is incompatible with the use of natural keys.

It's an argument, though I'm not completely sure if I think it would be solid. There are valid reasons to keep some information about a user, and I suspect keeping a list of emails to delete from a database restore as a separate list would be a valid use. Even if the user asked that their information was removed. It would be, but there would also be a need to ensure that the correct data was removed,  hence a list of emails.

The bigger problem for me is that if I needed to redact or alter this key data, which I would likely do in order to keep some integrity in my database, I'd need to alter this data in lots of tables. That makes for a much more complex set of scripts, including ensuring that I am correctly building a map of the new values I would use for a key. It's much easier to have a surrogate key that doesn't change and just redact the other information.

I'm sure there are arguments both ways, but as we move towards the era of not only seeing data as valuable, but also as an asset we can't completely control, I think surrogate keys make more sense now than ever. Let me know if you agree.

Total article views: 91 | Views in the last 30 days: 2
Related Articles

Change the Edition or Change the Design?

Today Steve Jones talks about revisiting your design and perhaps thinking about buying another editi...


Think like a DBA

What do your programmers do with nulls? Do they know nullif, isnull, what about the powerful coalesc...


Thinking about goals for 2011

The SQL community is inspiring, and there are a lot of people that work hard to contribute to it. Wi...


What I Love About SQL Server

Well, there are many things I love about SQL Server. Otherwise, I wouldn’t spend my whole career aro...


Database design issue

Database design