java · questdb · pg-jdbc

How to stream large result sets from QuestDB to a Java application?


I have an issue with the PG driver in Java. I usually need to stream around 50-60 million rows from the database to my Java application, but internally the PG driver creates a lot of garbage, and all the garbage collections slow down my application. The garbage comes from ResultSet.getDouble() or ResultSet.getTimestamp().

Is there an alternative, more GC-friendly way to stream data from QuestDB to a Java application?


Solution

  • You'll get better PG driver performance for large dataset transfers by setting the following connection properties:

    • binaryTransfer to 'true'
    • preferQueryMode to 'extended' or 'extendedForPrepared'
    • prepareThreshold to '-1' (use a server-side cursor from the first run)

    to trigger binary transfer (and thus reduce the number of temporary objects), and:

    • disable autocommit on the connection
    • execute the query as a prepared statement
    • set the fetch size on the statement prior to the executeQuery() call

    to limit the result set caching done by the PG driver.

    In binary mode, ResultSet.getTimestamp() still allocates an object for each row, so a workaround is to cast the timestamp to long in the query and convert it back to a timestamp on the client side yourself (via java.sql.Timestamp.setTime() and java.sql.Timestamp.setNanos(); see the sketch after the example). With the setup described above, transferring 60 million rows of 3 columns takes about 10 s. With the default settings it takes more than 5 times longer.

    Example:

    Properties properties = new Properties();
    properties.setProperty("user", "admin");
    properties.setProperty("password", "quest");
    // Enable binary transfer and switch to server-side prepared statements from the first execution.
    properties.setProperty("binaryTransfer", "true");
    properties.setProperty("preferQueryMode", "extended");
    properties.setProperty("prepareThreshold", "-1");
    
    try (final Connection connection = DriverManager.getConnection("jdbc:postgresql://127.0.0.1:8812/qdb", properties)) {
        // Autocommit must be off for the driver to honour the fetch size and stream with a cursor.
        connection.setAutoCommit(false);
    
        try (Statement create = connection.createStatement()) {
            // Generate 60 million rows of test data: a random double, a timestamp and a long.
            create.execute("create table x as (select rnd_double() d, cast(x as timestamp) ts, x l from long_sequence(60000000))");
            connection.commit();
        }
    
        long start = System.currentTimeMillis();
        // ts is cast to long in the query so that getLong() can be used instead of getTimestamp(),
        // avoiding a java.sql.Timestamp allocation per row.
        try (PreparedStatement select = connection.prepareStatement("select d, cast(ts as long), l from x")) {
            // Fetch rows in batches of 10,000 instead of caching the whole result set.
            select.setFetchSize(10000);
    
            double sumD = 0;
            long sumTs = 0;
            long sumL = 0;
    
            try (ResultSet resultSet = select.executeQuery()) {
                while (resultSet.next()) {
                    sumD += resultSet.getDouble(1);
                    sumTs += resultSet.getLong(2);
                    sumL += resultSet.getLong(3);
                }
            }
            System.out.println(sumD + ", " + sumTs + ", " + sumL);
        }
        long end = System.currentTimeMillis();
        System.out.println(end - start);
    }
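
    If you actually need java.sql.Timestamp values on the client, the workaround mentioned above can be done without a per-row allocation by reusing a single Timestamp instance and filling it from the streamed long. A minimal sketch, assuming the column was cast to long in the query and therefore holds microseconds since the epoch (QuestDB's timestamp resolution); setFromMicros() and the reused instance are illustrative names, not part of any API:

    // Reuses one Timestamp instance instead of allocating a new one per row.
    // micros is the value returned by getLong() for the cast(ts as long) column.
    static void setFromMicros(java.sql.Timestamp reusable, long micros) {
        // floorDiv/floorMod keep the split correct for pre-epoch (negative) values
        long seconds = Math.floorDiv(micros, 1_000_000L);
        long microsOfSecond = Math.floorMod(micros, 1_000_000L);
        reusable.setTime(seconds * 1000L);                  // whole seconds; setTime() resets the nanos field
        reusable.setNanos((int) (microsOfSecond * 1000L));  // fractional part of the second, in nanos
    }

    // Usage inside the ResultSet loop:
    //   java.sql.Timestamp reusable = new java.sql.Timestamp(0);  // created once, before the loop
    //   while (resultSet.next()) {
    //       setFromMicros(reusable, resultSet.getLong(2));
    //       // use `reusable` here; it is overwritten on the next iteration
    //   }

    Since the same Timestamp instance is mutated on every row, copy its value out (or convert it to your own representation) before the next call to next() if you need to keep it.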