I am extending this question Spring Batch Performance issue when run two independent jobs sequentially? further here:
We're seeing behavior of Postgres v9.6 where we're trying to save the data having null values using Batch updates taking a very long time to save the data.
Is there any way if we can fixed it either from database side or from Spring Boot side ?
Sample Query -
"INSERT INTO test.ACCT "
+ "(rec_type, acct_type_cd, src_acct_id, stat_cd,stat_dttm, ........, "
+ "..................) "
+ "VALUES(?, ? , ?, )";
Note - For security reasons can't add column name as is.
We've the table details and query like below
Total no. of rows we're inserting in batches are 40K and in the chunk of 2500 records. Code List<Map<String, Object>>
batchValues = new ArrayList<>(items.size());
for(Employee emp: items) {
batchValues.add(new MapSqlParameterSource()
.addValue("", emp.getXXXXX() == null ? "": emp.getXXXXX(), JDBCType.VARCHAR.getVendorTypeNumber())
.addValue("", emp.getXXXXX() == null ? "": emp.getXXXXX(), JDBCType.VARCHAR.getVendorTypeNumber())
.addValue("", emp.getXXXXX() == null ? "": emp.getXXXXX() , JDBCType.VARCHAR.getVendorTypeNumber())
.addValue("", emp.getXXXXX() == null ? "": emp.getXXXXX(), JDBCType.VARCHAR.getVendorTypeNumber())
.addValue("", emp.getXXXXX(), JDBCType.DATE.getVendorTypeNumber())
.addValue("", emp.getXXXXX()== null ? "": emp.getXXXXX(), JDBCType.VARCHAR.getVendorTypeNumber())
.addValue("", emp.getXXXXX() == null ? "": emp.getXXXXX(), JDBCType.VARCHAR.getVendorTypeNumber())
.addValue("", emp.getXXXXX() == null ? "": emp.getXXXXX(), JDBCType.VARCHAR.getVendorTypeNumber())
........
........
........
........
.getValues());
}
try {
int[] updateCounts = namedJdbcTemplate.batchUpdate(SQL, batchValues.toArray(new Map[items.size()]));
} catch (Exception e) {
log.error("Error occurred in BatchUpdate ##");
throw new GenericException(e.getMessage(),this.getClass().getSimpleName());
}
batch Jobs are running sequential, first is truncate (which is fast) and other insert in batches (having more null values) eating up the performance.
Appreciate for the help from M. Deinum and a_horse_with_no_name, always provide great help.
As suggested here Slow insert on PostgreSQL using JDBC and Section 5.3.2 - https://docs.spring.io/spring/docs/current/spring-framework-reference/data-access.html#jdbc-batch-list.
In such a scenario, with automatic setting of values on an underlying PreparedStatement, the corresponding JDBC type for each value needs to be derived from the given Java type. While this usually works well, there is a potential for issues (for example, with Map-contained null values). Spring, by default, calls ParameterMetaData.getParameterType in such a case, which can be expensive with your JDBC driver. You should use a recent driver version and consider setting the spring.jdbc.getParameterType.ignore property to true (as a JVM system property or in a spring.properties file in the root of your classpath) if you encounter a performance issue — for example, as reported on Oracle 12c (SPR-16139).
Alternatively, you might consider specifying the corresponding JDBC types explicitly, either through a 'BatchPreparedStatementSetter' (as shown earlier), through an explicit type array given to a 'List<Object[]>' based call, through 'registerSqlType' calls on a custom 'MapSqlParameterSource' instance, or through a 'BeanPropertySqlParameterSource' that derives the SQL type from the Java-declared property type even for a null value.
I've created spring.properties
and added spring.jdbc.getParameterType.ignore=true
solved my issues its now taking just 7-10 seconds to load 1800 records to 3 different tables 10 columns having 5 columns as NULL value.