JPA: what is the proper pattern for iterating over large result sets?
Page 537 of Java Persistence with Hibernate gives a solution using ScrollableResults, but alas it's only for Hibernate.
So it seems that using setFirstResult/setMaxResults and manual iteration really is necessary. Here's my solution using JPA:
private List<Model> getAllModelsIterable(int offset, int max)
{
    // ORDER BY keeps page boundaries stable from one query to the next
    return entityManager.createQuery("SELECT m FROM Model m ORDER BY m.id", Model.class)
            .setFirstResult(offset)
            .setMaxResults(max)
            .getResultList();
}
Then use it like this:
private void iterateAll()
{
    int offset = 0;
    List<Model> models;
    // Fetch one page at a time until an empty page signals the end
    while (!(models = getAllModelsIterable(offset, 100)).isEmpty())
    {
        entityManager.getTransaction().begin();
        for (Model model : models)
        {
            log.info("do something with model: " + model.getId());
        }
        entityManager.flush();
        // Detach the processed entities so the persistence context does not keep growing
        entityManager.clear();
        entityManager.getTransaction().commit();
        offset += models.size();
    }
}
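The paging loop above can be tried without a database. The following is only a minimal sketch: `PagedIteration`, `forEachPage`, and the in-memory `rows` list are hypothetical names standing in for the JPA query, not part of any library. The fetch function plays the role of `getAllModelsIterable(offset, max)`, and the loop stops at the first empty page, just as `iterateAll()` does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Hypothetical helper illustrating offset-based paging; not a JPA or Hibernate API.
final class PagedIteration {

    /**
     * Calls fetchPage(offset, pageSize) repeatedly and hands each non-empty page
     * to handler. Stops at the first empty page and returns the number of items seen.
     */
    static <T> int forEachPage(BiFunction<Integer, Integer, List<T>> fetchPage,
                               int pageSize,
                               Consumer<List<T>> handler) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int offset = 0;
        while (true) {
            List<T> page = fetchPage.apply(offset, pageSize);
            if (page.isEmpty()) {
                return offset;
            }
            handler.accept(page);
            // Advance by what was actually returned, so a short last page is handled
            offset += page.size();
        }
    }

    public static void main(String[] args) {
        // In-memory stand-in for the table; with JPA this would be the paged query.
        List<Integer> rows = new ArrayList<>();
        for (int i = 1; i <= 250; i++) {
            rows.add(i);
        }

        int total = PagedIteration.<Integer>forEachPage(
                (offset, max) -> rows.subList(Math.min(offset, rows.size()),
                                              Math.min(offset + max, rows.size())),
                100,
                page -> System.out.println("page of " + page.size()));
        System.out.println("total = " + total);
    }
}
```

With a real EntityManager, the handler would be the place to do the per-page work and then flush and clear, as in the answer's loop.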
I tried the answers presented here, but they didn't work with JBoss 5.1 + MySQL Connector/J 5.1.15 + Hibernate 3.3.2. We've just migrated from JBoss 4.x to JBoss 5.1, so we've stuck with it for now, and thus the latest Hibernate we can use is 3.3.2.
Adding a couple of extra parameters did the job, and code like this runs without OOMEs:
// Uses org.hibernate.LockMode, Query, ScrollMode, ScrollableResults, Session, StatelessSession
StatelessSession session = ((Session) entityManager.getDelegate()).getSessionFactory().openStatelessSession();
try {
    Query query = session
            .createQuery("SELECT a FROM Address a WHERE .... ORDER BY a.id");
    query.setFetchSize(1000);
    query.setReadOnly(true);
    query.setLockMode("a", LockMode.NONE);
    ScrollableResults results = query.scroll(ScrollMode.FORWARD_ONLY);
    try {
        while (results.next()) {
            Address addr = (Address) results.get(0);
            // Do stuff
        }
    } finally {
        results.close();
    }
} finally {
    // Close the session even if processing throws
    session.close();
}
The crucial lines are the query parameters between createQuery and scroll. Without them the "scroll" call tries to load everything into memory and either never finishes or runs to OutOfMemoryError.