I have a Spring Batch (v2.2.1) job that uses a partitioned step. This partitioned step accesses the database using a JpaPagingItemReader
to page through results returned. The partitions are executed within the local JVM using Spring's SimpleAsyncTaskExecutor
.
Assuming the database operations take a "long" time (long here meaning longer than the processing), my question boils down to this: what is a good rule of thumb for determining the maximum number of database connections to prevent the partitions from blocking on waiting for a connection?
My initial thoughts were that I should have at least gridSize
database connections so that each partitioned step has it's own database connection to work with plus a few extra for any overhead Spring might have (not a very scientific measure I know... hence the question).
What I observed with this configuration was that my partitioned steps would spend a lot of time blocked waiting for a database connection.
Here is an example using partitionJdbcJob
from spring-batch-samples to illustrate the scenario:
The datasource:
<bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
<property name="driverClassName" value="${jdbc.driver}" />
<property name="url" value="${jdbc.url}" />
<property name="username" value="${jdbc.user}" />
<property name="password" value="${jdbc.password}" />
<property name="validationQuery" value=""/>
<property name="testWhileIdle" value="false"/>
<property name="maxActive" value="7"/>
</bean>
The job, partitioner, and partitioned step:
<job id="partitionJdbcJob" xmlns="http://www.springframework.org/schema/batch">
<step id="step">
<partition step="step1" partitioner="partitioner">
<handler grid-size="5" task-executor="taskExecutor"/>
</partition>
</step>
</job>
<bean id="partitioner" class="org.springframework.batch.sample.common.ColumnRangePartitioner">
<property name="dataSource" ref="dataSource" />
<property name="table" value="CUSTOMER" />
<property name="column" value="ID" />
</bean>
<bean id="taskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor" />
<step id="step1" xmlns="http://www.springframework.org/schema/batch">
<tasklet>
<chunk writer="itemWriter" reader="itemReader" processor="itemProcessor" commit-interval="100" />
<listeners>
<listener ref="fileNameListener" />
</listeners>
</tasklet>
</step>
<bean id="itemReader" class="org.springframework.batch.item.database.JpaPagingItemReader" scope="step">
<property name="entityManagerFactory" ref="entityManagerFactory"/>
<property name="pageSize" value="100"/>
<property name="queryProvider">
<bean class="my.example.QueryProvider" scope="step">
<property name="minDefaultId" value="#{stepExecutionContext[minValue]}" />
<property name="maxDefaultId" value="#{stepExecutionContext[maxValue]}" />
</bean>
</property>
</bean>
Given the above configuration, here are the important numbers:
With this configuration I would expect the partitioner to partition 5 steps. Each step should be able to have it's own database connection (because the application is allowed 7 max... which seems like plenty). Then each step would page through it's work 100 records at a time, committing the work after each page.
However, I observed (using jconsole) that the threads executing my partitioned steps would frequently be blocked waiting for a database connection. Why would they block if my maximum active connections is greater than my grid size?
What I found is that in fact, you need a minimum of gridSize * 2 + 1
database connections. I came to the following conclusion as to why this is through observation (a lot of debugging). So if I am incorrect in some of my assumptions, please correct me.
Spring batch automatically handles the batch job's JDBC connections and transactions. When a Step begins, a connection is obtained and a transaction is opened. When the step finishes the transaction is committed and the connection is closed (returned to pool). For the duration of the step, the connection obtained at the beginning is considered 'in use'.
In addition to the transaction and connection that is opened for the Step, JpaPagingItemReader
also requires a transaction to do it's paging. This necessitates a new connection as well. The transaction/connection is obtained before reading a page and is committed/closed after writing.
Since (in my scenario) I am partitioning 5 steps, each one having it's own JpaPagingItemReader
, I need 5 connections for the steps as well as 5 connections for the readers (i.e. gridSize * 2
).
The extra 1 connection comes from the fact that the 'master' step is executing at the same time as the partitioned steps; this also requires a connection.
So going back to my example, I set the numbers like this:
And no more blocked threads!