Hibernate: All Entities Must Have an Id–And All Tables Should Be Entities!

Sometimes the relational database designer in me sometimes wants to create a table with multiple columns without a primary key (i.e., only a foreign key lookup . An example of when this might be appropriate is when some parent data type had many attributes that don’t necessarily need to be persisted as unique entities (any entity being any persisted POJO) on their own. Think of a “customer” table in a database. A customer may have many phone numbers, so a table relationship like this may seem to make sense:

customer
 id
 name
 address

cust_phone
 phone_number
 number_type
 customer_id

In the above example the cust_phone field has no primary key. It has a customer_id field, but this is a foreign key to the customer table. This seems to be a perfectly fine relational database design, but when it comes to Java persistence annotations we run into a few problems. We could go with a join table, but that isn’t really necessary here (we don’t have a many-to-many relationship).

Most importantly, if we want to do an insert or update on cust_phone, we’re going to need to have an annotated class to do it on, and we’re going to have to access that class with an Id (perhaps a composite Id in the case of a join table, but, again, there is little overhead in having a unique artificial primary key on a join or lookup table, and it makes the life of the Java developer easier). When a join table is implemented on the Java side as a POJO with its own Id, we can handle bidirectional updates using those POJOs. If we don’t have this ability we are stuck with navigating a collection of some sort and updating it all at once (at persistence time).

The above can be done using native SQL, but why bother (it would be complicated anyway)? We could also use the parent’s primary key as the child’s primary key, but that isn’t good database design. Ultimately, what makes the most sense is to accept that all Java persisted entities must have a unique @Id. There is little overhead and this prevents a number of problems at insert and update time.

To put it another way, we can think of something like this:

A student has many courses. A course has many students. Between these two entities we may have a student_course table like so:

STUDENT ID | COURSE ID
505        |        11
505        |        12
505        |        13
506        |        12
507        |        15

From a pure relational database point of view, this is fine. Heck, its common!

But to make life easier in the annotated class with this bidirectional relationship, let’s add am artificial primary key to this join table, like this:

ID | STUDENT ID | COURSE ID 
 1 |        505 |        11
 2 |        505 |        12
 3 |        505 |        13
 4 |        506 |        12
 5 |        507 |        15

Now instead of traversing a Hashmap or some other collection of student to course relationships inside of my student POJO, I can navigate each relationship as its own persisted entity class. I can even change an individual student/course enrollment. Although from a pure database point of view, this may be bad practice–we probably instead want some other column, such as “ACTIVE” or “DROPPED” or “REMOVED”. Herein lies yet an additional benefit: We have more details about the relationship. Not only do we know that the relationship exists, but we know some specifics about it, and we can access these specifics through a nice Java class with all sorts of helpful methods. The table above doesn’t have the appearance of a traditional join/lookup table in a relational database, sure, but we’re thinking in terms of database design as well as Java persistence annotations and Hibernate needs.

Here’s another major benefit: As long as my getters and setters account for the bidirectional nature of the object insertions (and they should!), I can add a student to a class or add a class to a list of student enrollments very easily in my Java code:

Class myClass = <some class>;
Student myStudent = <some student>;
myStudent.addClass(myStudent);

Or I can do it the other way:

Class myClass = <some class>;
Student myStudent = <some student>;
myyClass.addStudent(myStudent);

Sure, I could have done the exact same thing with a set of classes using @JoinTable, but when it comes time to manage insertion, modification and remove we end up writing more complicated code and potentially updating more on the database (i.e. we have more transactions than necessary) on the database side. Anther benefit is the ability to query against the join table (in this case student_class) and get a list of unique objects (StudentClassEnrollment), the type of which we know to be a specific POJO.

Now I see the obvious issue above: We have introduced the potential for enrolling a single student into a class twice. This is easy enough to take care of with unique constraints on multiple columns (student_id and class_id) as well as on the DAO/Insert side (in any case, we would want to check for such potential unique constraint violations at the front-end and the DAO end).

Of course, it is very important to remember to create appropriate getter and setter methods so that both sides of this bi-directional relationship (@ManyToOne and @OneToMany) have the necessary key fields. Hibernate will not magically handle this step. The big “duh” moment is when you think about how on earth Hibernate would handle an update on a table with no @Id field on the POJO. It couldn’t, because there is every reason to expect duplicate data on the columns outside of the FK (assuming the FK is not used as a unique ID). Adding to the confusion, there is always the chance (even if you think you have designed against it) that the PK->FK relationship becomes broken, and we are left with orphaned data.

What this all boils down to is designing a database with simplicity of JPA and Hibernate concerns in mind.

[JavaRanch: Hibernate Mapping with no primary key in existing DB]
[Java Persistence: Many To One]

[Thoughts On: Hibernate Many-To-Many Revisited] — They guy has some other thoughts about how to achieve many-to-many mappings and handle for PK. He somewhat agrees with me, but offers a hybrid approach.

Advertisements