spring batch-processing spring-batch spring-integration multi-tenant

Spring Batch with multi tenancy

How do we define Spring Batch jobs to run against multiple tenants?

I currently have a setup that runs a sequence of jobs in order every night against a single database schema. All the jobs read files from a location and insert the data into the database. The batch configuration is very basic: I defined a data source and a transaction manager and mapped the job repository to them. My jobs point to this repository and transaction manager. I am also persisting the batch metadata in the database.

My new requirement is to be able to run the same jobs (executed in order) against multiple tenants. Each tenant's data can live on the same database server in a different schema, or even on a different database server. My questions are:

1) Do we store the batch-specific metadata for all the tenants in one common database, or should each tenant database have its own?

2) My understanding is that we need a data source per tenant so that the jobs for that tenant can store the data read from files in its database. Should the Spring Batch job repository also point to the current tenant's data source when executing jobs for that tenant?

3) We are planning to run all tenants' jobs in parallel, meaning JOB1 can be running at the same time for all the tenants. I am still not sure how to manage the job repository, data source, and transaction manager when each running tenant is associated with a different data source.

4) Off the top of my head, all I can think of is duplicating my existing configuration for each tenant, each with its own job repository pointing to a tenant-specific data source and transaction manager. This is the last thing I would implement, and only if there is no way to define the same thing dynamically without duplication.

If anybody has solved this or has any ideas on how to approach a solution, please share. A sample config would help.

Solution

I was involved in building a SaaS application where we needed to do something similar, though not with Spring Batch.

The main idea for you is to:

a. Define a master database that stores all the configuration-specific data. For example, a table that maps each tenant name to its tenant information and data source configuration.

b. When your application starts, read this master database and maintain a local cache on the server, keyed by tenant name, with the tenant information (data source etc.) as the value.

c. Maintain a thread local, for example:

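// Holds the tenant bound to the current thread.
// TenantInformation is your own class carrying the tenant's name, data source, etc.
public class TenantThreadLocalContext
{
    // Kept private so callers go through set/get/unset.
    private static final ThreadLocal<TenantInformation> threadLocal = new ThreadLocal<>();

    // Bind a tenant to the current thread before processing starts.
    public static void set(TenantInformation tenantInformation)
    {
        threadLocal.set(tenantInformation);
    }

    // Clear the binding when processing ends; pooled threads are reused.
    public static void unset()
    {
        threadLocal.remove();
    }

    // Returns the tenant bound to the current thread, or null if none is set.
    public static TenantInformation get()
    {
        return threadLocal.get();
    }
}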

d. Whenever you start a thread to begin your (batch) processing, set this thread local with the tenant information so that each thread knows which tenant it is associated with.

e. Finally, at the time of database processing, look up which data source the current thread is associated with and use that data source to make a connection.
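To make steps b, d and e concrete, here is a minimal plain-Java sketch of how they could fit together. It is only an illustration under assumptions: TenantInfo, TenantContext and TenantJobRunner are hypothetical names (TenantContext is a compact copy of the thread local above so the example compiles on its own), the cache is filled by hand rather than from a master database, and a JDBC URL string stands in for a real data source. In a Spring application, the lookup in step e is the kind of thing AbstractRoutingDataSource.determineCurrentLookupKey() is meant for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for TenantInformation: a name plus a JDBC URL.
final class TenantInfo {
    final String name;
    final String jdbcUrl;
    TenantInfo(String name, String jdbcUrl) { this.name = name; this.jdbcUrl = jdbcUrl; }
}

// Step c: compact copy of the thread local holder.
final class TenantContext {
    private static final ThreadLocal<TenantInfo> CURRENT = new ThreadLocal<>();
    static void set(TenantInfo tenant) { CURRENT.set(tenant); }
    static TenantInfo get() { return CURRENT.get(); }
    static void unset() { CURRENT.remove(); }
}

final class TenantJobRunner {
    // Step b: local cache keyed by tenant name (assumed to be loaded
    // from the master database at startup in a real application).
    private final Map<String, TenantInfo> cache = new ConcurrentHashMap<>();

    void register(TenantInfo tenant) { cache.put(tenant.name, tenant); }

    // Step e: data access code asks the thread which database to use.
    static String resolveJdbcUrl() {
        TenantInfo tenant = TenantContext.get();
        if (tenant == null) {
            throw new IllegalStateException("No tenant bound to this thread");
        }
        return tenant.jdbcUrl;
    }

    // Step d: run the same work for every tenant in parallel,
    // binding the tenant to the worker thread first.
    List<String> runForAllTenants() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, cache.size()));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (TenantInfo tenant : cache.values()) {
                futures.add(pool.submit(() -> {
                    TenantContext.set(tenant);
                    try {
                        // A real job would open a connection and load files here.
                        return tenant.name + " -> " + resolveJdbcUrl();
                    } finally {
                        // Pooled threads are reused, so always clear the binding.
                        TenantContext.unset();
                    }
                }));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Clearing the thread local in a finally block matters here because the executor reuses its threads; without it, a later task could silently inherit the previous tenant.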

If you are using Hibernate then you are lucky, as Hibernate 4 has built-in multi-tenancy support that does all this for you. Refer: this. If you need help with the Hibernate configuration, I could probably help you out there as well.