What is the best way to bulk delete rows in JPA while also cascading the delete to child records?
The options are:
- Use the CascadeType.REMOVE setting on the mapping, load the entities, and call em.remove on each.
- Use bulk delete on the main entity and have the "ON DELETE CASCADE" database option set so that the database will cascade the delete for you. EclipseLink has a @CascadeOnDelete annotation that lets it know the "ON DELETE CASCADE" is set on a relationship, or to create it if using JPA for DDL generation: http://eclipse.org/eclipselink/documentation/2.5/jpa/extensions/a_cascadeondelete.htm
- Use multiple bulk deletes to remove children that might be referenced before removing the main entity. For example: "DELETE FROM Child c WHERE c.parent IN (SELECT p FROM Parent p WHERE [delete-conditions])" followed by "DELETE FROM Parent p WHERE [delete-conditions]". See section 10.2.4 of http://docs.oracle.com/middleware/1212/toplink/OTLCG/queries.htm#OTLCG94370 for details.
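The ordering in the last option matters: the child rows have to go first, otherwise the foreign key blocks the parent delete. A minimal sketch of that ordering is shown here, assuming a hypothetical ChildFirstBulkDelete helper. The executor argument is a stand-in for something like jpql -> em.createQuery(jpql).executeUpdate(), so the sketch runs without a JPA provider; the Child and Parent entity names come from the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical helper, not part of JPA: it only builds and orders the JPQL strings.
class ChildFirstBulkDelete {

    /** Returns the JPQL statements in the order they must run: children, then parents. */
    static List<String> statements(String parentCondition) {
        List<String> jpql = new ArrayList<>();
        // 1. Delete the children whose parent matches the condition.
        jpql.add("DELETE FROM Child c WHERE c.parent IN "
                + "(SELECT p FROM Parent p WHERE " + parentCondition + ")");
        // 2. Only then delete the parents themselves.
        jpql.add("DELETE FROM Parent p WHERE " + parentCondition);
        return jpql;
    }

    /** Runs each statement through the executor and sums the reported row counts. */
    static int deleteAll(String parentCondition, ToIntFunction<String> executor) {
        int total = 0;
        for (String statement : statements(parentCondition)) {
            total += executor.applyAsInt(statement);
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumption: the condition uses a named parameter that the real
        // executor would bind before calling executeUpdate().
        for (String statement : statements("p.status = :status")) {
            System.out.println(statement);
        }
    }
}
```

Both statements should run in the same transaction, so that a failure in the second one rolls back the child deletes as well.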
How does the JPA CriteriaDelete work
A JPA CriteriaDelete statement generates a JPQL bulk delete statement, which is then translated to an SQL bulk delete statement.
So, the following JPA CriteriaDelete statement:
CriteriaBuilder builder = entityManager.getCriteriaBuilder();
CriteriaDelete<PostComment> delete = builder.createCriteriaDelete(PostComment.class);
Root<PostComment> root = delete.from(PostComment.class);
int daysValidityThreshold = 3;
// Match SPAM comments last updated at least daysValidityThreshold days ago
delete.where(
    builder.and(
        builder.equal(
            root.get("status"),
            PostStatus.SPAM
        ),
        builder.lessThanOrEqualTo(
            root.get("updatedOn"),
            Timestamp.valueOf(
                LocalDateTime
                    .now()
                    .minusDays(daysValidityThreshold)
            )
        )
    )
);
int deleteCount = entityManager.createQuery(delete).executeUpdate();
generates this SQL delete query:
DELETE FROM
post_comment
WHERE
status = 2 AND
updated_on <= '2020-08-06 10:50:43.115'
So, there is no entity-level cascade, since the delete is executed as an SQL statement and not through the EntityManager.
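The timestamp literal in the generated SQL is simply the bound value computed on the Java side. A minimal sketch of that arithmetic follows, using a hypothetical DeleteCutoff helper and a fixed "now" instead of LocalDateTime.now(). Both are assumptions made for illustration, chosen so that the result lines up with the literal in the SQL above.

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;

// Hypothetical helper: isolates the cut-off computation used in the where clause.
class DeleteCutoff {

    /** Rows last updated at or before the returned timestamp match the delete. */
    static Timestamp cutoff(LocalDateTime now, int daysValidityThreshold) {
        return Timestamp.valueOf(now.minusDays(daysValidityThreshold));
    }

    public static void main(String[] args) {
        // Assumed "now": three days after the literal shown in the SQL statement.
        LocalDateTime now = LocalDateTime.of(2020, 8, 9, 10, 50, 43, 115_000_000);
        System.out.println(cutoff(now, 3)); // prints 2020-08-06 10:50:43.115
    }
}
```

Passing the current time in as an argument, rather than calling LocalDateTime.now() inside the query-building code, also makes the threshold easy to check in a unit test.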
Bulk Delete Cascading
To enable cascading when executing bulk delete, you need to use DDL-level cascade when declaring the FK constraints.
ALTER TABLE post_comment
ADD CONSTRAINT FK_POST_COMMENT_POST_ID
FOREIGN KEY (post_id) REFERENCES post
ON DELETE CASCADE
Now, when executing the following bulk delete statement:
DELETE FROM
post
WHERE
status = 2 AND
updated_on <= '2020-08-02 10:50:43.109'
The DB will delete the post_comment records referencing the post rows that got deleted.
The best way to execute DDL is via an automatic schema migration tool, like Flyway, so the Foreign Key definition should reside in a migration script.
If you are generating the migration scripts using the hbm2ddl tool, then, in the PostComment class, you can use the following Hibernate-specific mapping to generate the aforementioned DDL statement:
@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(foreignKey = @ForeignKey(name = "FK_POST_COMMENT_POST_ID"))
@OnDelete(action = OnDeleteAction.CASCADE)
private Post post;
If you really care about the time it takes to perform this bulk delete, I suggest you use JPQL to delete your entities. When you issue a DELETE JPQL query, it deletes the matching rows directly, without loading the entities first.
int deletedCount = entityManager.createQuery("DELETE FROM Country").executeUpdate();
You can also do conditional deletes based on entity attributes by binding query parameters, as shown below:
Query query = entityManager.createQuery(
    "DELETE FROM Country c WHERE c.population < :p");
int deletedCount = query.setParameter("p", 100000).executeUpdate();
executeUpdate will return the number of deleted rows once the operation is complete.
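Note that cascade types such as CascadeType.ALL or CascadeType.REMOVE are applied only when an entity is removed through EntityManager.remove. A JPQL bulk delete does not trigger them, so child rows must be removed by a separate bulk delete or by a database-level ON DELETE CASCADE. For example, with the following mapping, the Address is deleted along with its Employee only when the Employee is removed through EntityManager.remove:
@Entity
class Employee {
    @OneToOne(cascade = CascadeType.REMOVE)
    private Address address;
}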