I found this post on the subject of choosing a primary key. While the Java Persistence API (JPA) annotations allow us to use any field we want as a primary key (as long as it is naturally unique), is there a good reason to use anything other than a surrogate/artificial primary key?
There are plenty of plausible candidates for natural primary keys: SKUs, usernames, email addresses, and so on. These may work as a primary key insofar as they satisfy uniqueness requirements today, but there are drawbacks, the biggest being that their uniqueness may not stay guaranteed as the data and the business rules change.
The post lists ten cons of natural primary keys and explains how surrogate keys avoid each of them:
- Con 1: Primary key size – Surrogate keys generally don’t have problems with index size since they’re usually a single column of type int. That’s about as small as it gets.
- Con 2: Foreign key size – They don’t have foreign key or foreign index size problems either for the same reason as Con 1.
- Con 3: Aesthetics – Well, it’s an eye-of-the-beholder type thing, but they certainly don’t involve writing as much code as with compound natural keys.
- Con 4 & 5: Optionality & Applicability – Surrogate keys have no problems with people or things not wanting to or not being able to provide the data.
- Con 6: Uniqueness – They are 100% guaranteed to be unique. That’s a relief.
- Con 7: Privacy – They have no privacy concerns should an unscrupulous person obtain them.
- Con 8: Accidental Denormalization – You can’t accidentally denormalize non-business data.
- Con 9: Cascading Updates – Surrogate keys don’t change, so no worries about how to cascade them on update.
- Con 10: Varchar join speed – They’re generally integers, so they’re generally as fast to join over as you can get.
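Con 9 is perhaps the easiest one to see in code. The following is only a rough plain-Java sketch, not JPA: the class `CascadeDemo`, its method names, and the use of in-memory maps and lists as stand-ins for tables are all assumptions made for illustration. It compares what a rename costs when orders reference a user by username with what it costs when they reference a generated id.

```java
import java.util.List;
import java.util.Map;

// Hypothetical in-memory "tables"; maps and lists stand in for a database.
class CascadeDemo {

    // Natural key: each order row stores the owner's username in slot 0
    // as its foreign key. Returns the number of order rows rewritten.
    static int renameWithNaturalKey(Map<String, String> usersByUsername,
                                    List<String[]> orders,
                                    String oldName, String newName) {
        // Re-key the user row itself.
        usersByUsername.put(newName, usersByUsername.remove(oldName));
        // Cascade: every order that referenced the old key must be rewritten.
        int touched = 0;
        for (String[] order : orders) {
            if (order[0].equals(oldName)) {
                order[0] = newName;
                touched++;
            }
        }
        return touched;
    }

    // Surrogate key: orders store the user's numeric id, which never changes.
    // Returns the number of order rows rewritten, which is always zero.
    static int renameWithSurrogateKey(Map<Long, String> usernamesById,
                                      long userId, String newName) {
        usernamesById.put(userId, newName); // only the user row changes
        return 0;                           // no referencing rows are touched
    }
}
```

With the natural key, the amount of work grows with the number of referencing rows. With the surrogate key, a rename is a single-row update.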
So while on the surface it may seem simple to use a seemingly unique field as a primary key (a username on a domain, for example), it can be disastrous later on. Con 6 above is the big one, but Con 7 is something people don’t seem to think about as much. We can enforce uniqueness on any field we want with a unique constraint, whether or not it is the key. That said, I really cannot think of a good reason to use a natural key (other than developer laziness, which is in fact the key reason why bad code tends to be written in the first place).
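To make the point about enforcing uniqueness on a non-key field concrete, here is a minimal sketch in plain Java. The `UserTable` class and its methods are hypothetical, and the in-memory maps only imitate a table with a generated primary key plus a separate unique index on the username. In JPA the same split would typically be expressed with `@Id` and `@GeneratedValue` on the surrogate field and `@Column(unique = true)` on the natural one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for a users table.
class UserTable {
    private final AtomicLong nextId = new AtomicLong(1);            // surrogate key generator
    private final Map<Long, String> usernameById = new HashMap<>(); // primary key -> row
    private final Map<String, Long> idByUsername = new HashMap<>(); // unique index on username

    // Inserts a user and returns the generated surrogate id.
    long insert(String username) {
        if (idByUsername.containsKey(username)) {
            throw new IllegalArgumentException("username already taken: " + username);
        }
        long id = nextId.getAndIncrement();
        usernameById.put(id, username);
        idByUsername.put(username, id);
        return id;
    }

    // Renames a user; the id that other tables would reference stays the same.
    void rename(long id, String newUsername) {
        String oldUsername = usernameById.get(id);
        if (oldUsername == null) {
            throw new IllegalArgumentException("no such user id: " + id);
        }
        Long owner = idByUsername.get(newUsername);
        if (owner != null && owner != id) {
            throw new IllegalArgumentException("username already taken: " + newUsername);
        }
        idByUsername.remove(oldUsername);
        idByUsername.put(newUsername, id);
        usernameById.put(id, newUsername);
    }

    String usernameOf(long id) {
        return usernameById.get(id);
    }
}
```

The username is still guaranteed unique, but it is free to change, and nothing that references the user needs to know when it does.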
[Stack Overflow: Deciding between an artificial primary key and a natural key for a Products table]
[Rapid Application Development: Surrogate vs Natural Primary Keys – Data Modeling Mistake 2 of 10]
[Wikipedia: Surrogate Key]
[Wikipedia: Natural Key]