How to do bulk (multi row) inserts with JpaRepository?
To get a bulk insert with Spring Boot and Spring Data JPA you need only two things:
set the option
spring.jpa.properties.hibernate.jdbc.batch_size
to appropriate value you need (for example: 20).use
saveAll()
method of your repo with the list of entities prepared for inserting.
Working example is here.
Regarding the transformation of the insert statement into something like this:
INSERT INTO table VALUES (1, 2), (3, 4), (5, 6)
the such is available in PostgreSQL: you can set the option reWriteBatchedInserts
to true in jdbc connection string:
jdbc:postgresql://localhost:5432/db?reWriteBatchedInserts=true
then jdbc driver will do this transformation.
Additional info about batching you can find here.
UPDATED
Demo project in Kotlin: sb-kotlin-batch-insert-demo
UPDATED
Hibernate disables insert batching at the JDBC level transparently if you use an
IDENTITY
identifier generator.
You can configure Hibernate to do bulk DML. Have a look at Spring Data JPA - concurrent Bulk inserts/updates. I think section 2 of the answer could solve your problem:
Enable the batching of DML statements Enabling the batching support would result in less number of round trips to the database to insert/update the same number of records.
Quoting from batch INSERT and UPDATE statements:
hibernate.jdbc.batch_size = 50
hibernate.order_inserts = true
hibernate.order_updates = true
hibernate.jdbc.batch_versioned_data = true
UPDATE: You have to set the hibernate properties differently in your application.properties
file. They are under the namespace: spring.jpa.properties.*
. An example could look like the following:
spring.jpa.properties.hibernate.jdbc.batch_size = 50
spring.jpa.properties.hibernate.order_inserts = true
....
The underlying issues is the following code in SimpleJpaRepository:
@Transactional
public <S extends T> S save(S entity) {
if (entityInformation.isNew(entity)) {
em.persist(entity);
return entity;
} else {
return em.merge(entity);
}
}
In addition to the batch size property settings, you have to make sure that the class SimpleJpaRepository calls persist and not merge. There are a few approaches to resolve this: use an @Id
generator that does not query sequence, like
@Id
@GeneratedValue(generator = "uuid2")
@GenericGenerator(name = "uuid2", strategy = "uuid2")
var id: Long
Or forcing the persistence to treat the records as new by having your entity implement Persistable and overriding the isNew()
call
@Entity
class Thing implements Pesistable<Long> {
var value: Int,
@Id
@GeneratedValue
var id: Long = -1
@Transient
private boolean isNew = true;
@PostPersist
@PostLoad
void markNotNew() {
this.isNew = false;
}
@Override
boolean isNew() {
return isNew;
}
}
Or override the save(List)
and use the entity manager to call persist()
@Repository
public class ThingRepository extends SimpleJpaRepository<Thing, Long> {
private EntityManager entityManager;
public ThingRepository(EntityManager entityManager) {
super(Thing.class, entityManager);
this.entityManager=entityManager;
}
@Transactional
public List<Thing> save(List<Thing> things) {
things.forEach(thing -> entityManager.persist(thing));
return things;
}
}
The above code is based on the following links:
- http://www.hameister.org/SpringBootUsingIdsForBulkImports.html
- http://www.hameister.org/SpringBootBulkImportWithCrudRepository.html
- https://vladmihalcea.com/the-best-way-to-do-batch-processing-with-jpa-and-hibernate/