Tags: java, hibernate, lazy-loading, lazy-initialization, hibernate-batch-updates

Hibernate Batch Update - Entities are not updated


I have a batch process which is recalculating data for a set of entities. The entities list is fetched from DB by hibernate:

@Transactional(propagation = Propagation.REQUIRES_NEW)
public void recalculateUserData(Long userId){
    List<Entity> data = repository.getAllPendingRecalculation(userId);

    List<Entity> recalculated = new LinkedList<Entity>();

    for (Entity entity : data){
        recalculateEntity(entity, recalculated);
        recalculated.add(entity);
        flushIfNeeded(recalculated); //every 10 records
    }
}

private void recalculateEntity(Entity entity, List<Entity> recalculated){
    //do logic
}

private void flush(){
    getSession().flush();
    getSession().clear();
}

private void flushIfNeeded(List<Entity> data) {
    int flushSize = 10;
    if (data.size() % flushSize == 0){
        flush();
    }
}

When the process runs it looks like some entities are becoming detached, causing two symptoms:

  1. When trying to fetch lazy data I get an exception: org.hibernate.LazyInitializationException - could not initialize proxy - no Session.
  2. When no lazy load is needed, only the first 10 records are updated in the DB, even though flushIfNeeded(...) is called and flushes as expected.

My first attempt was to call session#refresh(entity) inside recalculateEntity(...). This solved the lazy initialization issue, but symptom #2 still occurred:

private void recalculateEntity(Entity entity){
    getSession().refresh(entity);
    //do logic
}

Since this didn't solve the issue, I thought about using attach(entity) instead of refresh(entity):

private void recalculateEntity(Entity entity){
    getSession().attach(entity);
    //do logic
}

This seems to work, but my question is: Why did these entities get detached in the first place?

(I'm using Hibernate 3.6.10)


Update

As @galovics explained:

The problem is that you are clearing the whole session which holds all your managed entities, making them detached.
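
A toy model can make this concrete. The sketch below is plain Java, not Hibernate: ToySession, Item and run are made-up names, and the model writes every tracked object on flush instead of doing real dirty checking. It only assumes the one behaviour described above: flush() writes objects the session still tracks, and clear() stops tracking all of them, including the ones that were fetched up front but not yet processed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model only, NOT Hibernate. All names here are hypothetical.
class ToySessionDemo {

    static class Item {
        final long id;
        int value;
        Item(long id, int value) { this.id = id; this.value = value; }
    }

    static class ToySession {
        private final Map<Long, Integer> database; // stands in for the table
        private final Set<Item> managed =
                Collections.newSetFromMap(new IdentityHashMap<Item, Boolean>());

        ToySession(Map<Long, Integer> database) { this.database = database; }

        // Loads every row and starts tracking the resulting objects.
        List<Item> loadAll() {
            List<Item> result = new ArrayList<>();
            for (Map.Entry<Long, Integer> row : database.entrySet()) {
                Item item = new Item(row.getKey(), row.getValue());
                managed.add(item);
                result.add(item);
            }
            return result;
        }

        // Writes the state of tracked objects only.
        void flush() {
            for (Item item : managed) {
                database.put(item.id, item.value);
            }
        }

        // Stops tracking everything: all loaded objects become "detached".
        void clear() { managed.clear(); }
    }

    // Mirrors the loop in the question: fetch all, then flush every flushSize items.
    static Map<Long, Integer> run(int rows, int flushSize, boolean clearAfterFlush) {
        Map<Long, Integer> db = new TreeMap<>();
        for (long id = 1; id <= rows; id++) {
            db.put(id, 0);
        }
        ToySession session = new ToySession(db);
        List<Item> data = session.loadAll();
        int processed = 0;
        for (Item item : data) {
            item.value = 1; // the "recalculation"
            processed++;
            if (processed % flushSize == 0) {
                session.flush();
                if (clearAfterFlush) {
                    session.clear();
                }
            }
        }
        session.flush(); // final flush, as would happen at commit
        return db;
    }

    static int countUpdated(Map<Long, Integer> db) {
        int updated = 0;
        for (int value : db.values()) {
            if (value == 1) {
                updated++;
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        System.out.println("flush + clear: " + countUpdated(run(25, 10, true)) + " of 25 updated");
        System.out.println("flush only:    " + countUpdated(run(25, 10, false)) + " of 25 updated");
    }
}
```

With flush plus clear, the model updates 10 of 25 rows, which matches symptom #2; with flush only, it updates all 25.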

Hibernate's batch processing documentation indicates that batch updates should be performed using ScrollableResults (which avoids these issues). In this case, however, I have to fetch all the data before processing it, because an entity's calculation might depend on entities that were already calculated. For example, calculating Entity#3 might require data calculated for Entity#1 and Entity#2.

For a case like this, is it better to reattach each entity with Session#attach(entity) (as shown in the code), to call Session#flush() without Session#clear(), or is there a better solution?


Solution

  • The problem is that you are clearing the whole session which holds all your managed entities, making them detached.

    If each step only works with part of the data, fetch only that part. You can then safely clear the whole session, fetch the next batch, and repeat the calculation.

    Article on LazyInitializationException just to clarify it.
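
The batch-by-batch idea can be sketched roughly as follows. This is plain Java with no Hibernate calls; chunks, recalculateAll and the "previous result plus id" rule are invented for illustration. It assumes that whatever later entities need from earlier ones can be carried forward as plain values, which may not hold for every calculation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: process ids in chunks and carry results forward as plain values.
class ChunkedRecalculationSketch {

    // Splits ids into consecutive chunks of at most chunkSize elements.
    static List<List<Long>> chunks(List<Long> ids, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<Long>> result = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, ids.size());
            result.add(new ArrayList<>(ids.subList(from, to)));
        }
        return result;
    }

    // Processes ids chunk by chunk, keeping earlier results outside any session.
    static Map<Long, Long> recalculateAll(List<Long> ids, int chunkSize) {
        Map<Long, Long> results = new LinkedHashMap<>();
        long previous = 0;
        for (List<Long> chunk : chunks(ids, chunkSize)) {
            // Real code would load only this chunk's entities here.
            for (Long id : chunk) {
                long value = previous + id; // made-up rule: depends on the previous result
                results.put(id, value);
                previous = value;
            }
            // Real code would flush and clear the session here; nothing loaded
            // for this chunk is touched again afterwards.
        }
        return results;
    }

    public static void main(String[] args) {
        List<Long> ids = Arrays.asList(1L, 2L, 3L, 4L, 5L);
        System.out.println(chunks(ids, 2));         // [[1, 2], [3, 4], [5]]
        System.out.println(recalculateAll(ids, 2)); // {1=1, 2=3, 3=6, 4=10, 5=15}
    }
}
```

Because each chunk is loaded, changed and flushed before the session is cleared, no entity is modified after it has become detached.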