First, let’s cover some facts that every experienced developer should know. Not every technology is suitable for every project, and no matter how advanced and automated a technology is, we should know the basics. In other words, if something is slow and unreliable, the fault likely lies with the person who chose that particular technology. Also, ORM is not a substitute for SQL – you still need to know SQL and have basic database knowledge.
My enthusiasm for ORM started a few years ago, after working with JDBC and a bunch of SQL embedded in the code. I still have some PTSD because of it. That code was quite difficult to maintain, and it was easy to create chaos. As always, sooner or later you realize that the new thing isn’t all it’s cracked up to be, and you get a headache. In this case, it wasn't the Friday afterwork party that caused it, but the Hibernate party. Someone without enough experience and knowledge had put Hibernate in an enterprise application. I have to admit that the result was terribly slow and quite difficult to maintain.
The main purpose of ORM is to act as a bridge between the OOP language used and a relational database. The problem is that relational databases don't really support OOP. For example, multiple classes in code can be represented by a single database table. Furthermore, the relational model does not support anything like inheritance, and as you probably know, we experienced Java developers would probably commit seppuku without classes and objects. Identity is another mismatch. OOP languages generally offer several ways to compare objects (by reference or by value), while in a relational database a row is identified by its primary key. Connections between entities in a relational database are established via foreign keys, while in an OOP language each class has to hold a reference to the other. And finally, how do you navigate objects effectively? The system administration team responsible for database maintenance will not be happy if you execute a separate query every time you need a single entity. Neither beer nor kebabs will help you please them.
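To make the identity mismatch a bit more concrete, here is a minimal plain-Java sketch. The `Customer` class and its fields are hypothetical and are not taken from any real project; the sketch only assumes that the `id` field plays the role of the primary key. It shows that two objects representing the same row are different references, and that equality has to be defined on the key explicitly.

```java
import java.util.Objects;

// Hypothetical entity, invented for illustration only.
final class Customer {
    private final Long id;      // plays the role of the primary key column
    private final String name;

    Customer(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Two instances count as the same entity when their keys match,
    // mirroring how the database identifies a row.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Customer)) return false;
        Customer that = (Customer) other;
        return id != null && id.equals(that.id);
    }

    // Must agree with equals: equal keys give equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }
}

final class IdentityDemo {
    public static void main(String[] args) {
        // Imagine both objects were loaded from the row with id = 1.
        Customer a = new Customer(1L, "Ana");
        Customer b = new Customer(1L, "Ana");

        System.out.println(a == b);      // false: two different references
        System.out.println(a.equals(b)); // true: same primary key
    }
}
```

How best to implement `equals` and `hashCode` for real JPA entities (especially with generated ids) is a topic of its own; treat this as an illustration of the mismatch, not as a recommendation.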
This whole intro was necessary to get to the main point of the article, which is entity inheritance. It's an important topic because if you make a mistake here, changing it later is difficult, expensive, and sometimes impossible. Mapping the database into classes matters because it determines how the data will be stored, retrieved, and validated, what resources will be needed, and much more.
JPA supports the following inheritance models:
- SINGLE_TABLE – One table per class hierarchy. In terms of performance and ease of implementation, this is the best strategy. The downside is that all columns that come from subclass properties must be nullable. Instead of listing those standard examples from the Internet, here's one from the real world and a real project. You are a part of the core team and you create a "Task" table that maps to the "ClTask" class. This entity will be reused again and again by the other teams. How to solve this? It’s simple – everything will be stored in a single table. For Hibernate to be able to differentiate between entities and map them properly into classes, the table gets a discriminator column (named DTYPE by default) that holds the name of the Java class; it can be customized with the @DiscriminatorColumn annotation.
- TABLE_PER_CONCRETE_CLASS (InheritanceType.TABLE_PER_CLASS in JPA) – This means that you simply mirror the database tables in classes, without any magic. It also means that if you need more relationships, your application will be as slow as a snail. This is because Hibernate has to look in every concrete table to load related entities, typically with a UNION or a bunch of separate queries, which will in turn slow down the whole application. Users, systems, and eventually your boss will be very happy. The advantage of this model is that you can define constraints on each property.
- JOINED – This strategy is quite similar to the SINGLE_TABLE strategy, except that we do not rely on a discriminator here. Instead, each class gets its own table, and a subclass table is joined to its parent table through the primary key, which can be customized with the @PrimaryKeyJoinColumn annotation. Each table contains only the properties declared in its own class, and the only redundant data is the id through which we join. The main advantage is that we can set constraints on subclass properties. The main downside is that with a large amount of data, queries can be quite slow because joins are expensive.
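To see how the three strategies differ on the database side, here is a small plain-Java sketch that computes which tables and columns each strategy would roughly produce. It deliberately does not use JPA itself, so it runs without any provider on the classpath. `ClTask` comes from the example above; `TeamTask`, `PersonalTask`, all field names, and the helper classes `EntityDef` and `InheritanceLayouts` are hypothetical. For simplicity the sketch also assumes that tables are named after classes and that the key column is called `id`. In a real mapping you would pick the strategy with `@Inheritance(strategy = InheritanceType.SINGLE_TABLE)`, `JOINED`, or `TABLE_PER_CLASS` on the root entity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Describes one class of the hierarchy: its name, its parent (null for the
// root), whether it is abstract, and the fields it declares itself.
final class EntityDef {
    final String name;
    final EntityDef parent;
    final boolean isAbstract;
    final List<String> ownFields;

    EntityDef(String name, EntityDef parent, boolean isAbstract, String... ownFields) {
        this.name = name;
        this.parent = parent;
        this.isAbstract = isAbstract;
        this.ownFields = Arrays.asList(ownFields);
    }

    // Inherited fields first (starting at the root), then the class's own.
    List<String> allFields() {
        List<String> fields = new ArrayList<>();
        if (parent != null) {
            fields.addAll(parent.allFields());
        }
        fields.addAll(ownFields);
        return fields;
    }

    EntityDef root() {
        return parent == null ? this : parent.root();
    }
}

final class InheritanceLayouts {

    // SINGLE_TABLE: one table for the whole hierarchy, with a DTYPE
    // discriminator and the fields of every class side by side.
    static Map<String, List<String>> singleTable(List<EntityDef> hierarchy) {
        List<String> columns = new ArrayList<>();
        columns.add("DTYPE");
        for (EntityDef e : hierarchy) {
            columns.addAll(e.ownFields);
        }
        Map<String, List<String>> tables = new LinkedHashMap<>();
        tables.put(hierarchy.get(0).root().name, columns);
        return tables;
    }

    // JOINED: one table per class with only its own fields; subclass
    // tables repeat the id so they can be joined to the parent table.
    static Map<String, List<String>> joined(List<EntityDef> hierarchy) {
        Map<String, List<String>> tables = new LinkedHashMap<>();
        for (EntityDef e : hierarchy) {
            List<String> columns = new ArrayList<>();
            if (e.parent != null) {
                columns.add("id"); // assumed key column name
            }
            columns.addAll(e.ownFields);
            tables.put(e.name, columns);
        }
        return tables;
    }

    // TABLE_PER_CLASS: one table per concrete class, each carrying the
    // inherited fields as well as its own.
    static Map<String, List<String>> tablePerClass(List<EntityDef> hierarchy) {
        Map<String, List<String>> tables = new LinkedHashMap<>();
        for (EntityDef e : hierarchy) {
            if (!e.isAbstract) {
                tables.put(e.name, e.allFields());
            }
        }
        return tables;
    }

    // Hypothetical hierarchy: an abstract ClTask with two subclasses.
    static List<EntityDef> sampleHierarchy() {
        EntityDef task = new EntityDef("ClTask", null, true, "id", "title");
        EntityDef team = new EntityDef("TeamTask", task, false, "teamName");
        EntityDef personal = new EntityDef("PersonalTask", task, false, "assignee");
        return Arrays.asList(task, team, personal);
    }

    public static void main(String[] args) {
        List<EntityDef> h = sampleHierarchy();
        System.out.println("SINGLE_TABLE:    " + singleTable(h));
        System.out.println("JOINED:          " + joined(h));
        System.out.println("TABLE_PER_CLASS: " + tablePerClass(h));
    }
}
```

With the sample hierarchy, SINGLE_TABLE yields one wide table in which `teamName` and `assignee` have to be nullable, JOINED yields three narrow tables linked by `id`, and TABLE_PER_CLASS yields two tables that both repeat `id` and `title`. That is the trade-off between speed, constraints, and redundancy described in the list above.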
Finally, which strategy should you choose?
There is no easy answer here. Basically, you can use all three strategies on the same project; it all depends on the business logic and the application itself. For master data and codebooks, TABLE_PER_CONCRETE_CLASS is generally a great strategy. The question is when to use SINGLE_TABLE and when to use JOINED. You will need to evaluate this yourself. It mostly depends on whether you care more about speed or about the ability to place constraints on the tables.