
XA Datasource 1PC optimization


I am working with JBoss EAP 6.4 (Java EE 6) and have a question about how the application server handles XA datasources (through EJB / JTA): is two-phase commit (2PC) always used, or is an "optimization" applied?

Let's say I have this:

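import javax.ejb.EJB;
import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;

@Stateless
@TransactionAttribute(TransactionAttributeType.REQUIRED)
public class MyEjb {
   @EJB
   private MyFirstEjb first;

   @EJB
   private MySecondEjb second;

   // Runs in a single JTA transaction (REQUIRED); both calls below join it
   public void process() {
      first.processJpaStuff();
      second.processJpaStuff();
   }
}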

Let's say that:

  • MyFirstEjb does JPA queries using XA Datasource 1.
  • MySecondEjb does JPA queries using XA Datasource 2.

I am using XA datasource because these EJBs can be used in other cases where 2PC is required (along with another datasource or a JMS provider).

I now would like to distinguish several cases:

  1. MyFirstEjb and MySecondEjb are deployed in the same application (EAR)
  2. MyFirstEjb and MySecondEjb are deployed in separate applications (EARs) within the same application server
  3. MyFirstEjb and MySecondEjb are deployed within different application servers

and sub-cases:

a) XA Datasource 1 = XA Datasource 2

b) XA Datasource 1 != XA Datasource 2 (same database)

c) XA Datasource 1 != XA Datasource 2 (different database)

I guess b) and c) are managed the same way: there is a global transaction, each datasource collaborates with the XA transaction manager, and 2PC is applied.

What about cases 1.a) and 2.a)? Since both eventually use the same datasource, I guess there is some kind of optimization that avoids a full 2PC for the global transaction? If so, is there any official (JTA / JBoss / ...) documentation that explains this? Does it work the same way across all application servers / implementations?

Thanks


Solution

  • It depends.

    The JTA (transaction coordinator) knows nothing about EJBs or applications. It is concerned only with XAResources and their associated transaction branches. In the normal case, the JCA layer that manages the connection pool used by JPA provides the JTA with one XAResource per datasource used. The JTA assigns each a different branch qualifier under the same global transaction id.
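To make the branch idea concrete, here is a minimal sketch of two Xids that share one global transaction id but carry different branch qualifiers. It uses only the javax.transaction.xa.Xid interface that ships with the JDK. The class names, the format id and the "branch-N" naming are illustrative assumptions, not what JBoss or Narayana actually generate.

```java
import javax.transaction.xa.Xid;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BranchXidSketch {

    /** Minimal Xid: one global transaction id plus one branch qualifier. */
    static final class SimpleXid implements Xid {
        private final byte[] globalId;
        private final byte[] branchQualifier;

        SimpleXid(byte[] globalId, byte[] branchQualifier) {
            this.globalId = globalId.clone();
            this.branchQualifier = branchQualifier.clone();
        }

        @Override public int getFormatId() { return 0x1234; } // arbitrary, illustrative value
        @Override public byte[] getGlobalTransactionId() { return globalId.clone(); }
        @Override public byte[] getBranchQualifier() { return branchQualifier.clone(); }
    }

    /** Builds the Xid for the n-th resource enlisted in the given global transaction. */
    public static Xid branchFor(byte[] globalId, int branchNumber) {
        // Hypothetical naming scheme; real transaction managers use their own encoding.
        byte[] bqual = ("branch-" + branchNumber).getBytes(StandardCharsets.UTF_8);
        return new SimpleXid(globalId, bqual);
    }

    public static void main(String[] args) {
        byte[] gtrid = "global-tx-1".getBytes(StandardCharsets.UTF_8);
        Xid first = branchFor(gtrid, 1);   // e.g. the XAResource for XA Datasource 1
        Xid second = branchFor(gtrid, 2);  // e.g. the XAResource for XA Datasource 2

        System.out.println("same global tx: "
                + Arrays.equals(first.getGlobalTransactionId(), second.getGlobalTransactionId()));
        System.out.println("same branch: "
                + Arrays.equals(first.getBranchQualifier(), second.getBranchQualifier()));
    }
}
```

The shared global id is what lets a database engine recognise that two connections belong to the same transaction, even though the coordinator treats them as separate branches.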

    During transaction termination the JTA prepares each XAResource, and it is at this point that the optimization kicks in. If the db engine detects that it has multiple branches (connections/XAResources) for the same global tx, it may vote prepared (XA_OK in the XAResource API) from the first XAResource but read-only (XA_RDONLY) from the remaining resource(s). If the tx ends up with only one prepared resource and the rest read-only, the JTA can optimize the remainder of the termination accordingly. See for example:

    http://narayana.io/docs/product/#two-phase-variants

    https://docs.oracle.com/cd/B10501_01/java.920/a96654/xadistra.htm#1061004
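The effect of read-only votes on termination can be sketched as follows. This is a simplified coordinator loop, not Narayana's implementation: it assumes stub resources with fixed votes, omits start/end, logging, recovery and rollback handling, and passes a null Xid because the stubs ignore it. XA_OK and XA_RDONLY are the real constants from javax.transaction.xa.XAResource; everything else is a hypothetical name.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReadOnlyVoteSketch {

    /** Stub resource that returns a fixed vote from prepare() and counts commit calls. */
    static class StubResource implements XAResource {
        final int vote;
        int commits = 0;

        StubResource(int vote) { this.vote = vote; }

        @Override public int prepare(Xid xid) { return vote; }
        @Override public void commit(Xid xid, boolean onePhase) { commits++; }
        @Override public void rollback(Xid xid) { }
        @Override public void start(Xid xid, int flags) { }
        @Override public void end(Xid xid, int flags) { }
        @Override public void forget(Xid xid) { }
        @Override public Xid[] recover(int flag) { return new Xid[0]; }
        @Override public boolean isSameRM(XAResource other) { return other == this; }
        @Override public int getTransactionTimeout() { return 0; }
        @Override public boolean setTransactionTimeout(int seconds) { return false; }
    }

    /**
     * Prepares every branch, then commits only the branches that voted XA_OK.
     * Returns how many branches needed a second-phase commit.
     */
    public static int terminate(List<? extends XAResource> branches, Xid xid) throws XAException {
        List<XAResource> prepared = new ArrayList<>();
        for (XAResource branch : branches) {
            if (branch.prepare(xid) == XAResource.XA_OK) {
                prepared.add(branch);
            }
            // XA_RDONLY: the branch has nothing to commit and drops out of phase two.
        }
        for (XAResource branch : prepared) {
            branch.commit(xid, false);
        }
        return prepared.size();
    }

    public static void main(String[] args) throws XAException {
        // Two connections to the same db engine: one votes prepared, the other read-only.
        StubResource first = new StubResource(XAResource.XA_OK);
        StubResource second = new StubResource(XAResource.XA_RDONLY);

        int committed = terminate(Arrays.asList(first, second), null);
        System.out.println("branches committed in phase two: " + committed);
        System.out.println("first.commits=" + first.commits + ", second.commits=" + second.commits);
    }
}
```

With these two stubs, only the first branch takes part in the second phase, which is the saving described above.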

    Note that depending on the vendor, 'db engine' and 'database' are not exactly the same thing. Some systems will host multiple dbs on the same server and allow the optimization to work across them, whereas others may treat each as a separate transaction engine scope and not optimize such cases. Datasources may also differ only in userid/schema used for the connection, relying on the permissions/schema namespacing to isolate applications without requiring a distinct database for the purpose. The optimization almost always works in such cases.

    In some cases where the apps use the same XADatasource, the JCA registers just one XAResource with the JTA, potentially allowing it to use the more aggressive 1PC optimization instead.
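As a rough illustration of the 1PC case, the sketch below shows a coordinator that skips prepare entirely when exactly one XAResource is enlisted and calls commit with the onePhase flag set. It is a sketch under assumptions: the class names are hypothetical, the stubs always vote XA_OK, a null Xid is passed because the stubs ignore it, and start/end, logging and rollback handling are omitted.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OnePhaseSketch {

    /** Stub resource that records which protocol calls the coordinator makes. */
    static class RecordingResource implements XAResource {
        final List<String> calls = new ArrayList<>();

        @Override public int prepare(Xid xid) { calls.add("prepare"); return XAResource.XA_OK; }
        @Override public void commit(Xid xid, boolean onePhase) { calls.add("commit(onePhase=" + onePhase + ")"); }
        @Override public void rollback(Xid xid) { calls.add("rollback"); }
        @Override public void start(Xid xid, int flags) { }
        @Override public void end(Xid xid, int flags) { }
        @Override public void forget(Xid xid) { }
        @Override public Xid[] recover(int flag) { return new Xid[0]; }
        @Override public boolean isSameRM(XAResource other) { return other == this; }
        @Override public int getTransactionTimeout() { return 0; }
        @Override public boolean setTransactionTimeout(int seconds) { return false; }
    }

    /** Uses one-phase commit for a single enlisted resource, full 2PC otherwise. */
    public static void terminate(List<? extends XAResource> enlisted, Xid xid) throws XAException {
        if (enlisted.size() == 1) {
            // Only one participant: no vote is needed, so prepare is skipped.
            enlisted.get(0).commit(xid, true);
            return;
        }
        List<XAResource> prepared = new ArrayList<>();
        for (XAResource resource : enlisted) {
            if (resource.prepare(xid) == XAResource.XA_OK) {
                prepared.add(resource);
            }
        }
        for (XAResource resource : prepared) {
            resource.commit(xid, false);
        }
    }

    public static void main(String[] args) throws XAException {
        RecordingResource shared = new RecordingResource();
        terminate(Arrays.asList(shared), null);
        System.out.println(shared.calls); // [commit(onePhase=true)]

        RecordingResource ds1 = new RecordingResource();
        RecordingResource ds2 = new RecordingResource();
        terminate(Arrays.asList(ds1, ds2), null);
        System.out.println(ds1.calls);    // [prepare, commit(onePhase=false)]
    }
}
```

The difference from the read-only case is that here the prepare round trip is never made at all, which is why this is the more aggressive of the two optimizations.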

    Whilst it's true that connections may switch between local and XA transaction context, it's currently not possible for the JTA to take advantage of this. Since resources are enlisted only on demand, the system doesn't know how many are going to participate in the transaction until it reaches the termination stage. The JTA spec group has previously discussed allowing configuration, similar to the way tx timeouts are set, which would allow the application to indicate at begin that a tx is expected to be single resource, or more generally to list what XAResources it's expected to contain. That information would allow the JTA to drive the resource in local tx mode rather than XA mode where appropriate, eliding the start/end/prepare protocol calls. It would also remove the need to manually optimize such cases by deploying both XA and non-XA datasources for the same database in an application. It's not currently on the roadmap though.