A database with incorrect or outdated data is of no use to anyone, especially if you don’t have auditing and entity revisioning set up. To keep track of changes, it’s recommended that each change be recorded and a historical log kept. That way, you can always see who changed what and when, and trace an error back to the change that introduced it.
I’ve had the most success doing this with Spring Boot 2.7.8 and Spring Data JPA. In this blog post, I will demonstrate step by step how I did it.
Getting started with change tracking
The process of collecting changes to data within a database is called Change Data Capture (CDC). Through it, you can identify and capture changes, then forward them to a downstream process or system in real time. There are several ways to implement CDC. Hibernate Envers is one of them: it tracks data changes with application-based triggers, that is, inside the application rather than in the database.
Java version: 17
Hibernate Envers version: 5.6.14.Final
Database-migration Tool: Flyway
First things first
Before we turn to Hibernate Envers, we first enable Spring Data JPA auditing. The auditing infrastructure needs to know who the current auditor is, so we provide that information by implementing the AuditorAware<T> interface from Spring Data. T is the type used to identify the auditor, for example a String username or a numeric user ID.
In our example, we hard-code Medium as the current user. You can also take a look at the examples to get the logged-in user’s ID from Spring Security.
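Here is a minimal sketch of such an implementation. To keep it runnable without Spring on the classpath, it declares a local stand-in interface with the same shape as Spring Data's AuditorAware<T>; in a real project you would delete the stand-in and import org.springframework.data.domain.AuditorAware instead. The class name FixedAuditorAware is hypothetical.

```java
import java.util.Optional;

// Stand-in with the same shape as org.springframework.data.domain.AuditorAware,
// declared here only so the sketch compiles without Spring on the classpath.
interface AuditorAware<T> {
    Optional<T> getCurrentAuditor();
}

// Hypothetical implementation that always reports "Medium" as the auditor.
class FixedAuditorAware implements AuditorAware<String> {
    @Override
    public Optional<String> getCurrentAuditor() {
        // A real application would look up the logged-in user here.
        return Optional.of("Medium");
    }
}
```

Returning an Optional lets the implementation signal that there is no current auditor, for example in a background job, by returning Optional.empty().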
After implementing AuditorAware, we create a persistence configuration class that enables JPA auditing and exposes our implementation as a bean, so that the auditing infrastructure can find it.
Keep in mind that the auditorAwareRef value on @EnableJpaAuditing must match the name of the AuditorAware bean we define.
Last but not least, we need to create columns in the table to store who created or modified the entity as well as the time of creation and modification. There are 4 columns/fields that we need to create:
- CreatedBy: Declares a field as the one representing the principal that created the entity containing the field;
- CreatedDate: Declares a field as the one representing the date the entity containing the field was created;
- LastModifiedBy: Declares a field as the one representing the principal that recently modified the entity containing the field;
- LastModifiedDate: Declares a field as the one representing the date the entity containing the field was recently modified.
The CreatedBy and CreatedDate fields will be set only when the record/entity is created, and they will not be updated each time the record gets updated, while LastModifiedBy and LastModifiedDate fields will be updated each time the record/entity gets updated.
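To make the difference concrete, here is a plain-Java model of that behaviour. It is not the Spring listener itself, only a sketch of the rules it follows; the class and method names are hypothetical. It assumes Spring's default setting, in which the last-modified fields are also filled in when the record is created.

```java
import java.time.Instant;

// Plain-Java model of the four audit fields; not the Spring listener itself.
class AuditStamp {
    String createdBy;
    Instant createdDate;
    String lastModifiedBy;
    Instant lastModifiedDate;

    // Applied once, when the record is first persisted.
    void markCreated(String auditor, Instant now) {
        createdBy = auditor;
        createdDate = now;
        // Assumption: as with Spring's default, creation also counts as a modification.
        lastModifiedBy = auditor;
        lastModifiedDate = now;
    }

    // Applied on every later update; the created* fields are left untouched.
    void markModified(String auditor, Instant now) {
        lastModifiedBy = auditor;
        lastModifiedDate = now;
    }
}
```

If one user creates a record and another user edits it a day later, createdBy and createdDate still point to the first user and the original time, while the two last-modified fields point to the second user and the time of the edit.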
With that in mind, let’s create an abstract class that contains all the fields we want for auditing.
I recommend an abstract class annotated with @EntityListeners(AuditingEntityListener.class) and @MappedSuperclass, so that its mapping information is applied to the entities that inherit from it. Keep in mind that a mapped superclass has no separate table defined for it.
If you don’t wish to have an abstract class, you can put all those fields in your entity. Make sure you also add @EntityListeners(AuditingEntityListener.class) at the class level.
SQL script (PostgreSQL):
SQL script (PostgreSQL) for the existing table:
Full script used in our example:
If everything goes well, we will be able to see a record with the values of the audits.
Congrats! You can now give your QA a decent answer when they ask “Who edited my records?”
Hibernate Envers to the rescue
If keeping track of who created or changed an entity is not enough, let me introduce you to Hibernate Envers.
Envers is a core module of Hibernate. It keeps track of all changes to the persisted entities. We can even keep the changes of a removed entity!
Enabling Envers requires more effort, so to start off, you need to add the hibernate-envers dependency to your build.
If you wish to use another version, you can look it up on mvnrepository. Keep in mind that if you’re using Spring Boot 3+, you should use Hibernate 6. Otherwise, stick with Hibernate 5. Mixing Hibernate 5 with Spring Boot 3+, or vice versa, will break the app, because Spring Boot 3 and Hibernate 6 moved to Jakarta Persistence, where the packages and configuration parameter names were renamed from javax.persistence.* to jakarta.persistence.*.
Next, we need to make a couple of configurations.
Setting an Envers strategy
audit_strategy: ValidityAuditStrategy stores both the start revision and the end revision of each audit row, so we can tell over which range of revisions a given snapshot was valid.
audit_strategy_validity_store_revend_timestamp: true means that the timestamp of the end revision is stored as well, that is, the moment at which the snapshot was superseded by a later change.
store_data_at_delete: true means the entity data is stored in the revision when the entity gets deleted, instead of only the ID with all other fields set to null. If you want to keep changes of the removed entity, you must set this property to true. It is false by default.
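As a rough sketch, the three settings can be written out as Hibernate property keys like this. The keys use the org.hibernate.envers prefix; the fully qualified strategy class name is my assumption for Hibernate 5.6 and should be checked against the version you use. In a Spring Boot application, the same keys would normally go into application.properties with the spring.jpa.properties. prefix, which the second method demonstrates. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EnversSettings {
    // The three Envers settings discussed above, as plain Hibernate properties.
    static Map<String, String> hibernateProperties() {
        Map<String, String> props = new LinkedHashMap<>();
        // Assumption: class name as shipped with Hibernate 5.6.
        props.put("org.hibernate.envers.audit_strategy",
                  "org.hibernate.envers.strategy.internal.ValidityAuditStrategy");
        props.put("org.hibernate.envers.audit_strategy_validity_store_revend_timestamp", "true");
        props.put("org.hibernate.envers.store_data_at_delete", "true");
        return props;
    }

    // The same settings as they would be keyed in Spring Boot's application.properties.
    static Map<String, String> springBootProperties() {
        Map<String, String> prefixed = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : hibernateProperties().entrySet()) {
            prefixed.put("spring.jpa.properties." + entry.getKey(), entry.getValue());
        }
        return prefixed;
    }
}
```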
You can find more configuration properties such as changing suffixes, prefixes, and certain field names here.
Enabling where to use Envers
The one and only annotation we need to use to apply Envers to a class or field is @Audited. When applied to a class, it indicates that all of its properties should be audited. When applied to a field, it indicates that this field should be audited.
So, when you annotate a class, make sure every audited field has a matching column in the *_aud table which we are about to create. If your entity has a child collection, it should have an _aud table too since it is a part of the parent entity. Also, keep in mind that if you use an abstract class to keep auditing fields, @Audited on the entity class does not cover the fields inherited from the abstract class. To audit those as well, put @Audited on the abstract class itself or on the individual fields in it.
Creating tables for revision and audits
Hibernate expects to find revinfo and *_aud tables in your database. If they are missing, schema validation fails and the app will not start. We must make sure the tables exist.
Table validation error:
[PersistenceUnit: default] Unable to build Hibernate SessionFactory; nested exception is org.hibernate.tool.schema.spi.SchemaManagementException: Schema-validation: missing table [revinfo]
If you’re using a data-migration tool, I highly recommend letting Hibernate validate the schema. To do so, set spring.jpa.hibernate.ddl-auto to validate in application.properties.
“The more validation I need, the less discernment I have.” — Kurt Hanks
The app will stop at startup if there is a mismatch between table names/columns and entity table names/fields. The best example is the one that we mentioned above (the validation error). Another example could be if you change the audit table suffix and forget to update your table name in the database, the app will crash and show the following error:
Caused by: org.hibernate.tool.schema.spi.SchemaManagementException: Schema-validation: missing table [users_my_lovely_suffix]
SQL scripts for revision and audits
In our example, we only have the users table. We enabled auditing at the class level, and on the last_modified_by and last_modified_date fields in the abstract class. Let’s create some tables.
You may ask why we’re creating a sequence with the name hibernate_sequence. Hibernate Envers uses a global sequence called hibernate_sequence to generate the revision numbers it inserts into the revinfo table. If such a sequence does not exist, the following error will be shown at startup: Schema-validation: missing sequence [hibernate_sequence].
Here is when the hibernate_sequence is used:
revinfo is the default name of the table in which Envers stores each revision number and its timestamp (epoch). If such a table does not exist, Hibernate will fail with the error message: Schema-validation: missing table [revinfo].
rev represents the version number while revtstmp represents the epoch timestamp of the revision.
Each *_aud table must have rev, revend, revtype, and revend_tstmp columns (revend and revend_tstmp are needed because of the ValidityAuditStrategy settings we chose). All these column and table names, prefixes and suffixes are Hibernate’s defaults. If you wish to change any of them, check out the Envers configuration.
In each *_aud table, rev is the revision number, revend is the revision number of the next revision after the entity gets updated, revend_tstmp is the timestamp of the next revision, and revtype is the revision type. There are 3 revision types.
- 0 means an entity is created
- 1 means an entity is updated
- 2 means an entity is removed
Once the desired tables are created, we’re ready to create some entities.
Creating an entity:
After creation, we will be able to see Hibernate actions if show-sql is enabled — spring.jpa.show-sql=true in application.properties.
This is what the users_aud table looks like for the created entity.
Keep in mind that the revtype is 0, meaning that this snapshot records an insert.
Let’s update our entity to see what is going to happen.
This is what the users_aud table looks like after updating an entity.
Did you notice a change? revtype is now 1, meaning that the snapshot of this entity is “update”. The previous revision now has revend and revend_tstmp values. These indicate the revision number and the time of the next record for the updated entity. How cool is that!
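If you want to see the bookkeeping without a database, here is a simplified in-memory model of what happens to the audit rows. It is only a sketch of the idea, not Envers code: the class names are hypothetical, the fields mirror the column names used in this post, and the exception stands in for the error Envers raises when it cannot find an earlier snapshot to close.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified in-memory model of one users_aud row.
class AuditRow {
    final long id;      // ID of the audited entity
    final int rev;      // revision in which this snapshot was written
    final int revtype;  // 0 = insert, 1 = update, 2 = delete
    Integer revend;     // revision that superseded this snapshot, or null
    Long revendTstmp;   // epoch timestamp of that revision, or null

    AuditRow(long id, int rev, int revtype) {
        this.id = id;
        this.rev = rev;
        this.revtype = revtype;
    }
}

class AuditTableModel {
    final List<AuditRow> rows = new ArrayList<>();

    // Writes a new snapshot; for updates and deletes it first closes the open one.
    void record(long id, int rev, int revtype, long timestamp) {
        if (revtype != 0) {
            boolean closedPrevious = false;
            for (AuditRow row : rows) {
                if (row.id == id && row.revend == null) {
                    row.revend = rev;
                    row.revendTstmp = timestamp;
                    closedPrevious = true;
                }
            }
            if (!closedPrevious) {
                throw new IllegalStateException("No open snapshot to close for entity " + id);
            }
        }
        rows.add(new AuditRow(id, rev, revtype));
    }
}
```

Calling record(1L, 1, 0, ...) and then record(1L, 2, 1, ...) leaves two rows: the first with revend set to 2, the second still open with revend empty.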
Now, let’s delete our entity to see what’s going to happen.
This is what the users_aud table looks like after deleting an entity.
Notice the 2nd row whose revtype is 2.
You may ask what those numbers for revtype are. They are the ordinals of the RevisionType enum, which encapsulates the following state modifications:
ADD (0): Indicates that the entity was added (persisted) at that revision.
MOD (1): Indicates that the entity was modified (one or more of its fields) at that revision.
DEL (2): Indicates that the entity was deleted (removed) at that revision.
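If you read revtype values straight from the database, a small decoder can turn them back into names. The sketch below uses a local stand-in enum with the same constant order as org.hibernate.envers.RevisionType, so that it runs without Envers on the classpath; the name RevType and the method fromColumn are hypothetical. With Envers available, you would use RevisionType itself, which has a fromRepresentation method for the same purpose.

```java
// Stand-in mirroring the constant order of org.hibernate.envers.RevisionType.
enum RevType {
    ADD, MOD, DEL;

    // Maps a revtype column value (0, 1 or 2) back to its constant.
    static RevType fromColumn(int revtype) {
        RevType[] all = values();
        if (revtype < 0 || revtype >= all.length) {
            throw new IllegalArgumentException("Unknown revtype: " + revtype);
        }
        return all[revtype];
    }
}
```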
Getting an entity snapshot
With the help of AuditReaderFactory, we can query the entity snapshots, a.k.a. the history of a record.
If the *_aud table has an id column that keeps the id of the entity, we can safely use AuditEntity.id() to point to the ID column. Otherwise, AuditEntity.property("your-prop-name") can be used. .forRevisionsOfEntity() takes three parameters.
- c: The class of entities for which to query.
- selectEntitiesOnly: if true, a list of entities is returned instead of a list of three-element arrays. If you want the history of a record as a list of entities, set it to true. If it is false, each result is a three-element array holding the entity instance, the revision entity, and the revision type.
- selectDeletedEntities: if true, the revisions where entities were deleted will be returned.
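When selectEntitiesOnly is false, the query result is an untyped list of three-element arrays, which is awkward to work with. Here is a small helper that unpacks such rows into typed objects. It is a sketch of my own, not part of Envers: RevisionRow and RevisionRows are hypothetical names, and the revision entity and revision type are kept as plain Object so the code runs without Envers on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for one result row when selectEntitiesOnly is false.
class RevisionRow<T> {
    final T entity;               // snapshot of the audited entity
    final Object revisionEntity;  // revision metadata (number, timestamp)
    final Object revisionType;    // ADD, MOD or DEL

    RevisionRow(T entity, Object revisionEntity, Object revisionType) {
        this.entity = entity;
        this.revisionEntity = revisionEntity;
        this.revisionType = revisionType;
    }
}

class RevisionRows {
    // Converts untyped three-element rows into typed holders, checking the shape.
    static <T> List<RevisionRow<T>> unpack(Class<T> entityType, List<?> rows) {
        List<RevisionRow<T>> result = new ArrayList<>();
        for (Object row : rows) {
            Object[] cells = (Object[]) row;
            if (cells.length != 3) {
                throw new IllegalArgumentException("Expected 3 elements but got " + cells.length);
            }
            result.add(new RevisionRow<>(entityType.cast(cells[0]), cells[1], cells[2]));
        }
        return result;
    }
}
```

In a real application, the rows argument would be the list returned by the audit query, and entityType would be the audited class, such as UserEntity.class.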
I recommend creating an audit repository so that you can call any entity revision anywhere:
There are many options for creating a query. You can check them out here.
If you use getRevisions to get the history of a user by their email address, you can call it as auditRepository.getRevisions(UserEntity.class, "email", "email@example.com"); and the result will be a list as shown below.
Seeding existing data
If you enable Envers for tables that already contain records and the audit strategy is set to ValidityAuditStrategy, you may get errors when Envers updates the revend column: Cannot update previous revision for entity com.example.demo.entity.UserEntity_AUD. Don’t worry! For the existing records, we can write a script to seed the audit tables.
Here is a sample script for our users_aud table.
With this script, we make sure that the seeded data represents the current insert (ADD) snapshot of all the audited data prior to future update (MOD) changes being applied.
Congrats! You’ve now created a reliable system for tracking data changes. Your team will be able to monitor data changes in real time and make amendments when needed. No matter how much data gets updated, every change will be recorded and audited.
Interested in digging deeper? You can clone the GitHub repository, and debug the sample project to see how it works.