Search code examples
javastored-proceduresjavadbdbconnection

Parse large text files and move the data into a database


I have a quite big text file around 1.5Gb. I have to parse the file line by line and insert the lines into a Derby database. I read a lot of forum regarding the performance and how to parse the file etc. My issue is that i benchmarked all my processes and it would take to read and parse a line for 1ms, how ever i have to make sure that the line I'm trying to insert is not exists if it is then i have to make some update on it. This part of the process is taking around 9ms.

In total that 10 ms which is really much regarding that the file contains around 10 million rows.

I'm using PreparedStatement for the querys.

Is there any way i can speed up the Query part of my code?


Solution

  • Did you turn of Autocommit ?

    dbConnection.setAutoCommit(false);
    

    Use batch insert instead of one by one like here:

        Connection dbConnection = null;
        PreparedStatement preparedStatement = null;
    
        String insertTableSQL = "INSERT INTO DBUSER"
                + "(USER_ID, USERNAME, CREATED_BY, CREATED_DATE) VALUES"
                + "(?,?,?,?)";
    
        try {
            dbConnection = getDBConnection();
            preparedStatement = dbConnection.prepareStatement(insertTableSQL);
    
            dbConnection.setAutoCommit(false);
    
            preparedStatement.setInt(1, 101);
            preparedStatement.setString(2, "mkyong101");
            preparedStatement.setString(3, "system");
            preparedStatement.setTimestamp(4, getCurrentTimeStamp());
            preparedStatement.addBatch();
    
            preparedStatement.setInt(1, 102);
            preparedStatement.setString(2, "mkyong102");
            preparedStatement.setString(3, "system");
            preparedStatement.setTimestamp(4, getCurrentTimeStamp());
            preparedStatement.addBatch();
    
            preparedStatement.setInt(1, 103);
            preparedStatement.setString(2, "mkyong103");
            preparedStatement.setString(3, "system");
            preparedStatement.setTimestamp(4, getCurrentTimeStamp());
            preparedStatement.addBatch();
    
            preparedStatement.executeBatch();
    
            dbConnection.commit();
    
            System.out.println("Record is inserted into DBUSER table!");
    
        } catch (SQLException e) {
    
            System.out.println(e.getMessage());
            dbConnection.rollback();
    
        } finally {
    
            if (preparedStatement != null) {
                preparedStatement.close();
            }
    
            if (dbConnection != null) {
                dbConnection.close();
            }
    
        }
    

    Have a look in : https://builds.apache.org/job/Derby-docs/lastSuccessfulBuild/artifact/trunk/out/tuning/tuningderby.pdf