Search code examples
mysqljpaconcurrencyeclipselinkjpql

Updating a bunch of rows is it transaction based or row based?


I've a table with a column that needs to be constantly recomputed and I want this table to be scallable. Users must be able to write on it as well.

It's difficult to test this type of things without having a server and concurrent users, at least I don't know how. So is one of those two options viable ?

@ApplicationScoped
public class Abean {
   @EJB
   private MyService myService;
   @Asynchronous
   public void computeTheData(){
      long i = 1;
      long numberOfRows = myService.getCountRows(); // gives the number of row in the table
      while(i<numberOfRows){
        myService.updateMyRow(i);
      }
      computeTheData(); // recursion so it never stops, I'm wondering if this wouldn't spawn more threads and if it would be an issue.
   }
}

public class MyService implements MyServiceInterface{
    ...
    public void updateMyRows(int row){
       Query query = em.createQuery("SELECT m FROM MyEntity WHERE m.id=:id");
       Query.setParameter("id", row);
       List<MyEntity> myEntities = (MyEntity) query.getResultList();
       myEntity.computeData();
    }
}

VS

@ApplicationScoped
public class Abean {
   @EJB
   private MyService myService;
   @Asynchronous
   public void computeTheData(){
      myService.updateAllRows();
   }
}

public class MyService implements MyServiceInterface{
        ...
    public void updateAllRows(int page){
       Query query = em.createQuery("SELECT m FROM MyEntity");
       List<MyEntity> myEntities = (MyEntity) query.getResultList();
       myEntity.computeData();
    }
}

Is any of this viable ? I'm using mysql and the engine for tables is innoDB.


Solution

  • You should use pessimistic locking to lock modified rows before update, so that manual modifications by a user are not in conflict with background updates. If you did not use locking, your user's modifications would sometimes be rolled back, if they collide with background job having modified the same row.

    Also, with pessimistic locking, your user may encounter rollback if her transaction waits to acquire the lock for longer than timeout happends. To prevent this, you should make all transactions, which use pessimistic locks, as short as possible. Therefore, the background job should create a new transaction for every row or small group of rows, if it may run longer than reasonable time. Locks are released only after the transaction finishes (User will wait until lock is released).

    Example of how your MyService could look like, running every update in separate transaction (in reality, you may run multiple updates in batch in single transaction, passing list or range of ids as parameter to updateMyRows):

    public class MyService implements MyServiceInterface{
            ...
        @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW) // this will create a new transaction when running this method from another bean, e.g. from Abean
        public void updateMyRows(int row){
           TypedQuery<MyEntity> query = em.createQuery(SELECT m FROM MyEntity WHERE m.id=:id", MyEntity.class);
           query.setParameter("id", row);
           query.setLockMode(LockModeType.PESSIMISTIC_WRITE); // this will lock all entities retrieved by the query
           List<MyEntity> myEntities = query.getResultList();
           if (!myEntities.isEmpty()) {
             myEntities.get(0).computeData();
           }
        }
    }
    

    When you use only id in where condition, you may consider em.find(row, MyEntity.class, LockModeType.PESSIMISTIC_WRITE).computeData() instead of using a query (add null pointer check after em.find())

    Other notes:

    It is not clear from the question how you trigger the background job. Running the job infinitely, as you wrote in the example, would on one hand NOT create additional threads (as you call the methond on the same bean, annotations are not considered recursively). On the other hand, if there is an exception, your background job should at least handle exceptions so that it will not be stopped. You may also want to add some wait time between subsequent executions.

    It is better to run background jobs as scheduled jobs. One possible option is @Schedule annotation instead of @Asynchronous. You may specify the frequency, in which the job will be executed in background. It is then good to check in the beginning of your job, whether previous execution is finished. Another option with Java EE 7 is to use ManagedScheduledExecutorService to trigger a background job periodically at specified interval.