Search code examples
javamysqlhibernatejakarta-eewildfly-10

Batch insert causing failure


I have a task where I need to insert a lot of data into the database using hibernate. I am currently testing with inserting 500,000 entities which each have one relation, so 1,000,000 inserts in total. Based on this guide I have created the following code which actually works. All the data is inserted and committed without errors.

import javax.annotation.Resource;
import javax.ejb.*;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import javax.transaction.*;
import javax.xml.stream.XMLStreamException;
import javax.xml.transform.TransformerException;
import java.io.File;
import java.io.IOException;
import java.text.ParseException;

@Stateless
@TransactionManagement(TransactionManagementType.BEAN)
public class StackoverflowExample {
    @PersistenceContext
    private EntityManager entityManager;

    @Resource
    private SessionContext sessionContext;

    @EJB
    private XmlProcessorFactory xmlProcessorFactory;

    @EJB
    private TaskManagerBean taskManagerBean;

    public void processFile(String[] args, Task task) throws HeuristicRollbackException, RollbackException, HeuristicMixedException, SystemException, IOException, TransformerException, ParseException, XMLStreamException, NotSupportedException {
        UserTransaction tx = null;

        XmlProcessor instance = xmlProcessorFactory.getInstance(new File("data.xml"));

        XmlElement nextElement = instance.getNextElement();
        int i = 0;
        int batchSize = 50;
        entityManager.setProperty("hibernate.jdbc.batch_size", batchSize);
        tx = sessionContext.getUserTransaction();
        tx.begin();
        while (nextElement != null) {
            Entry entry = new Entry(nextElement.getUserReference(), nextElement.getXml());

            entityManager.persist(entry);

            if (i % batchSize == 0) {
                entityManager.flush();
                entityManager.clear();
            }

            nextElement = instance.getNextElement();
            i++;
        }

        task.setStatus(status);
        task.setEndTime(now());
        // This gives the OutOfMemoryError
        entityManager.merge(task);

        tx.commit();
    }
}

This will fail in the line where I call taskManagerBean.update() with the following error:

2017-03-31 08:49:30,212 ERROR [org.jboss.as.ejb3.invocation] (EJB default - 3) WFLYEJB0034:
EJB Invocation failed on component TaskManagerBean for method public void
TaskManagerBean.update(Task,TaskStatus):
javax.ejb.EJBTransactionRolledbackException: org.hibernate.exception.GenericJDBCException:
could not load an entity: [Task#3]
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleInCallerTx(CMTTxInterceptor.java:159)
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInCallerTx(CMTTxInterceptor.java:256)
    ...
    at TaskManagerBean$$$view18.update(Unknown Source)
    at StoreEntriesBean.processFile(StoreEntriesBean.java:117)
    ...
    at org.jboss.threads.JBossThread.run(JBossThread.java:320)
Caused by: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: could not load an entity: [Task#3]
    at org.hibernate.jpa.spi.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1692)
    ...
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInCallerTx(CMTTxInterceptor.java:254)
    ... 104 more
Caused by: org.hibernate.exception.GenericJDBCException: could not load an entity: [Task#3]
    at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:47)
    ...
    at org.hibernate.jpa.spi.AbstractEntityManagerImpl.merge(AbstractEntityManagerImpl.java:1161)
    ... 135 more
Caused by: java.sql.SQLException: Error
    at org.jboss.jca.adapters.jdbc.WrappedConnection.checkException(WrappedConnection.java:1972)
    ...
    at org.hibernate.loader.Loader.loadEntity(Loader.java:2204)
    ... 155 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:3664)
    ...
    at org.hibernate.event.internal.DefaultLoadEventListener.doLoad(DefaultLoadEventListener.java:478)

Runtime.getRuntime().freeMemory() returns 71540896 right before the error.

I start Wildfly with -Xmx2048m and a quick glance at memory usage shows that it only uses less than half of it.

I have tried committing the inserted Entries after every 1,000 inserts. This triggers the TaskManagerBean and it will fail the same way after a few updates. In a few occasions I have also seen this error after I finished the huge batch job and the owning entity was to be updated.

I have tried with a file with just 5,000 entries and the entire process works fine.

Is this a bug in the MySQL driver or am I doing something wrong here?


Solution

  • The call to EntityManger.merge() is what causes the error. I am not 100 % familiar with Hibernate but apparently merge will in this case fetch the entire relation from Task to Entry, which in this case was a collection with 500,000 entries - even though this relation is Lasy loaded.

    I replaced merge with a Entity.find(Task.class, taskId) and set the status on that instance instead and that solved my problem.

    In the meantime I have introduced Java Batch Processing which I can only recommend. It avoids having to write batch jobs yourself.