Search code examples
javaspring-bootspring-repositories

Spring: Native Query in batch using query annotation in a Repository?


I've got to do upserts of thousands of records at a time, broken into batches of let's say 25 actual upserts at a time. The upsert is based on a unique constraint being service+key, and if the insert fails, it updates the value.

Note that I'm interested in the UPSERT case, not just standard inserts, as I want to avoid the extra db READ operations as well as this data is mostly static.

I'm using Spring with Postgres 14, and was thinking I'd do something like the following to do the upsert in a Repository class, but wanted to ensure that the upserts are done in batches if possible.

I've set the following spring boot properties, but don't see any log messages suggesting batching is happening:

spring.jpa.properties.hibernate.jdbc.batch_size = 25
spring.jpa.properties.hibernate.order_inserts = true

Can I do it using a method like so, or do I have to do it another way, like so possibly?

@Transactional
@Modifying
@Query(
  nativeQuery = true,
  value =
      "INSERT INTO my_table "
          + "  (service, key, value, version) "
          + "VALUES "
          + "  (:service, :key, :value, :version) "
          + "ON CONFLICT ON CONSTRAINT allow_one_value_per_service_key DO "
          + "  UPDATE "
          + "    SET "
          + "      value = excluded.value, version=my_table.version + 1 "
          + "    WHERE  "
          + "      my_table.value <> excluded.value")
void upsertValue(
      @Param("service") String service,
      @Param("key") String key,
      @Param("value") String value,
      @Param("version") int version
      );

EDIT: My goal is to only ever do DB inserts or updates in batches (single transaction per batch), and do nothing if the value hasn't changed. So for a new "service" that hasn't had any values inserted yet, I will be inserting about 5000 values at once, which I want to ensure is done in groups, to avoid overloading Postgres and causing large transaction logs.

EDIT 2: Note that if your using @DataJpaTest to test your upsert, it will fail with H2 database, as it doesn't support `INSERT ON CONFLICT(unique) UPDATE. More details here


Solution

  • Most of other answers mentioned somethings related to GenerationType.IDENTITY or the hibernate batch properties such as jdbc.batch_size etc. , but these things are only matter if you batch update the records in the hibernate ways which is through its dirty checking mechanism or EntityManager 's persist() or merge(). If you are using the native SQL to update , it has nothing to do with these settings.

    Fundamentally , batch executing native SQL requires to use JDBC 2.0 Batch API but there are no easy ways to access these API through hibernate API. So I would recommend to use JdbcTemplate which has a batchUpdate() method that will delegate to JDBC 2.0 Batch API PreparedStatement#addBatch() for the actual work.

    Given you already had the native upsert SQL , it is actually very easy to do it via JdbcTemplate.

    Firstly , define a record for my_table :

    public record MyTable(String service, String key, String value, int version){}
    

    Then implement a service for upserting them :

    @Service
    public class MyTableService {
    
        @Autowired
        NamedParameterJdbcTemplate jt;
    
        @Transactional
        public void upsert(List<MyTable> rows) {
    
            String sql = "INSERT INTO my_table "
                    + "  (service, key, value, version) "
                    + "VALUES "
                    + "  (:service, :key, :value, :version) "
                    + "ON CONFLICT ON CONSTRAINT allow_one_value_per_service_key DO "
                    + "  UPDATE "
                    + "    SET "
                    + "      value = excluded.value, version=my_table.version + 1 "
                    + "    WHERE  "
                    + "      my_table.value <> excluded.value";
    
            MapSqlParameterSource[] params = rows.stream().map(r -> {
                MapSqlParameterSource paramValues = new MapSqlParameterSource();
                paramValues.addValue("service", r.service());
                paramValues.addValue("key", r.key());
                paramValues.addValue("value", r.value());
                paramValues.addValue("version", r.version());
                return paramValues;
            }).toArray(MapSqlParameterSource[]::new);
    
            jt.batchUpdate(sql, params);
    
        }
    }
    
    

    To use it :

    List<MyTable> records = new ArrayList<>();
    records.add(new MyTable("service1", "key1", "value1", 0));
    records.add(new MyTable("service2", "key2", "value2", 0));
    records.add(new MyTable("service3", "key3", "value3", 0));
    records.add(new MyTable("service4", "key4", "value4", 0));
    myTableService.upsert(records);