
Java HSQLDB - Bulk Batch Insert: OutOfMemoryError: GC overhead limit exceeded


I have 57 million strings in an ArrayList that I want to insert into an HSQLDB database. But I always run out of memory during the process, getting either "java.lang.OutOfMemoryError: Java heap space" or "java.lang.OutOfMemoryError: GC overhead limit exceeded".

I have tried every solution that was suggested here.

        DateFormat df = new SimpleDateFormat("dd.MM.yyyy HH:mm:ss");
        String now = df.format(new Date());
        long iteration = 0;
        int i, size, totalSize;
        size = totalSize = values.size();
        PreparedStatement statement = connection.prepareStatement("INSERT INTO primes (P_VALUE, DATE_FOUND) VALUES (?, ?);");
        while (size > 0) {
            for (i = 0; i < 1000000 && i < size; i++) {
                iteration++;
                statement.setString(1, values.get(i));
                statement.setString(2, now);
                statement.addBatch();
            }
            values = values.subList(i, size);
            size = values.size();
            statement.executeBatch();
        }
        statement.closeOnCompletion();

I am pretty sure the problem is connected with the JDBC statement, because changing the batch size in the for-loop condition makes no difference.

I tried

  • smaller and larger batch sizes
  • re-creating the statement after every executeBatch()
  • closing the statement after every executeBatch()
  • committing after every executeBatch()

Solution

  • It looks like you are using in-memory (MEMORY) tables, and memory runs out when you insert a lot of rows.

    Try CREATE CACHED TABLE with a file-based database. You can then try different batch sizes.

    You must also commit after each batch. All the inserted rows are kept in memory until you commit.
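Putting the two points together, a minimal sketch of the approach might look like the following. It assumes a file-based HSQLDB URL (`jdbc:hsqldb:file:primesdb`, path and `SA` credentials are placeholders), a CACHED table so rows live on disk rather than in memory, and a commit after every `executeBatch()`; the column types and batch size are illustrative, not prescriptive:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

public class CachedTableBatchInsert {

    static void insertPrimes(List<String> values, String now) throws SQLException {
        // File-based database: CACHED tables store their rows on disk,
        // so they are not limited by the heap the way MEMORY tables are.
        try (Connection connection = DriverManager.getConnection(
                "jdbc:hsqldb:file:primesdb", "SA", "")) {
            connection.setAutoCommit(false);
            try (Statement ddl = connection.createStatement()) {
                ddl.execute("CREATE CACHED TABLE primes "
                        + "(P_VALUE VARCHAR(64), DATE_FOUND VARCHAR(32))");
            }
            final int batchSize = 100_000; // experiment with this
            try (PreparedStatement statement = connection.prepareStatement(
                    "INSERT INTO primes (P_VALUE, DATE_FOUND) VALUES (?, ?)")) {
                int inBatch = 0;
                // Iterate by index instead of repeatedly taking subList() views.
                for (int i = 0; i < values.size(); i++) {
                    statement.setString(1, values.get(i));
                    statement.setString(2, now);
                    statement.addBatch();
                    if (++inBatch == batchSize) {
                        statement.executeBatch();
                        connection.commit(); // frees the uncommitted rows held in memory
                        inBatch = 0;
                    }
                }
                if (inBatch > 0) { // flush the final partial batch
                    statement.executeBatch();
                    connection.commit();
                }
            }
        }
    }
}
```

The key difference from the code in the question is that each batch is committed before the next one starts, so the engine never has to hold millions of uncommitted rows at once.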