JPA Entity Manager transaction scope vs extended scope

We are using the EclipseLink implementation of JPA in our web application. We use an EntityManager with transaction scope (JTA transaction type) inside a stateless bean.

From what I have found in several sources and after discussions with my colleagues, that's the recommended way to go with the entity manager's scope. But in my own experiments during development I found several problems. For example, if you call entityManager.find(class, id) twice in a method, it returns two different objects. This is expected, because the entity manager is transaction-scoped and we have two separate transactions. So this approach uses more memory.

My question is: why shouldn't we use an entity manager with extended scope inside a stateful bean? Aren't there any advantages? What problems can it cause?

What we use:

@Stateless
@LocalBean
public class SessionBean{
    @PersistenceContext(unitName = "name", type = PersistenceContextType.TRANSACTION)
    EntityManager entityManager;
}

What I want to change to:

@Stateful
@LocalBean
public class SessionBean{
    @PersistenceContext(unitName = "name", type = PersistenceContextType.EXTENDED)
    EntityManager entityManager;
}

Solution

On the contrary! In general, transaction-scoped entity managers tend to be more memory efficient than extended entity managers.

First, let's name the main concept in this discussion: the persistence context.

According to the specification: "A persistence context is a set of entity instances in which for any persistent entity identity there is a unique entity instance".
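
To make that definition concrete, here is a minimal plain-Java sketch of the "one instance per identity" rule. It is a toy model, not the JPA API: ToyPersistenceContext, ToyPerson and the loader callback are hypothetical names invented for illustration, and the identity is simplified to a bare id (a real persistence context keys on entity type plus primary key).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Toy model only, NOT the JPA API: it mimics the "unique instance per identity" rule.
class ToyPersistenceContext {
    // Maps an entity identity (simplified to a long id) to its single managed instance.
    private final Map<Long, Object> managed = new HashMap<>();

    // Returns the managed instance for the id, invoking the loader at most once per id.
    @SuppressWarnings("unchecked")
    <T> T find(long id, LongFunction<T> loader) {
        return (T) managed.computeIfAbsent(id, key -> loader.apply(key));
    }
}

// Hypothetical entity stand-in, used only by this sketch.
class ToyPerson {
    final long id;
    String name;

    ToyPerson(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

class IdentityDemo {
    public static void main(String[] args) {
        ToyPersistenceContext ctx = new ToyPersistenceContext();
        ToyPerson first = ctx.find(1L, id -> new ToyPerson(id, "oldName"));
        ToyPerson second = ctx.find(1L, id -> new ToyPerson(id, "oldName"));
        // Same context, same identity: the very same instance. Prints 'true'.
        System.out.println(first == second);

        ToyPersistenceContext other = new ToyPersistenceContext();
        ToyPerson third = other.find(1L, id -> new ToyPerson(id, "oldName"));
        // A different context holds its own instance. Prints 'false'.
        System.out.println(first == third);
    }
}
```

The second half of the demo is the situation described in the question: two lookups return different objects only when they go through two different persistence contexts.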

Understanding how persistence contexts interact with JTA transactions is essential knowledge for a JPA programmer. In both proposed scenarios, we're talking about container-managed persistence contexts. So how do they work?

Transaction-scoped persistence contexts

As the name suggests, a transaction-scoped persistence context is bound to a transaction and its lifecycle. It's created within a transaction and will be closed when the transaction ends. And, most importantly in our context, it's the only persistence context created for the transaction. This means that even if more than one EntityManager instance is created during the transaction, they will share the same persistence context.

This means that different service beans that inject the EntityManager interface share the same set of entities (the same persistence context), despite having different EntityManager instances assigned to them. So calling entityManager.find(class, id) twice, three times or more returns the same instance. This is true even if the find() call is made through a different EntityManager instance. As long as the application is running within the same transaction, the same entity instance is returned, regardless of which EntityManager it was retrieved from.

Finally, being bound to a transaction, as stated earlier, implies that the persistence context is closed when the transaction ends. This means the persistence context is first flushed and then cleared. This behavior is a considerable memory saving compared to extended persistence contexts: once the persistence context is cleared, the entity instances retrieved during the transaction become detached and, unless the application still holds references to them, eligible for garbage collection.

Let's see the shared persistence context in an example.

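// Imports for the listing below, assuming Java EE 8 (javax.* packages);
// on Jakarta EE 9+ the same types live under jakarta.*.
import java.util.Objects;
import javax.ejb.EJB;
import javax.ejb.Stateless;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.PersistenceContext;

@Entity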
public class Person {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private long id;
    private String name;

    /* constructors, getters and setters omitted */
}

@Stateless
public class StatelessBeanA {

    @PersistenceContext
    private EntityManager em;

    public void changePersonName(long id, String newName) {
        Person person = em.find(Person.class, id);
        person.setName(newName);
    }

    public Person getPerson(long id) {
        return em.find(Person.class, id);
    }

    public EntityManager getEm() {
        return em;
    }
}

@Stateless
public class StatelessBeanB {

    @PersistenceContext
    private EntityManager em;

    public void printsPersonName(long id) {
        var person = em.find(Person.class, id);
        System.out.println(person.getName());
    }

    public Person getPerson(long id) {
        return em.find(Person.class, id);
    }

    public EntityManager getEm() {
        return em;
    }
}

@Stateless
public class Controller {

    @PersistenceContext
    EntityManager em;
    @EJB
    StatelessBeanA serviceA;
    @EJB
    StatelessBeanB serviceB;

    public void execute() {
        // prints 'false'
        System.out.println(Objects.equals(serviceA.getEm(), serviceB.getEm()));

        Person person = new Person("oldName");
        em.persist(person);
        final long id = person.getId();

        serviceA.changePersonName(id, "newName");

        // prints 'newName'
        serviceB.printsPersonName(id);

        var p1 = serviceA.getPerson(id);
        var p2 = serviceB.getPerson(id);

        // prints 'true'
        System.out.println(Objects.equals(p1, p2));
    }
}

Extended persistence contexts

To quote the specification again:

"A container-managed extended persistence context can only be initiated within the scope of a stateful session bean. It exists from the point at which the stateful session bean that declares a dependency on an entity manager of type PersistenceContextType.EXTENDED is created, and is said to be bound to the stateful session bean." ... "The persistence context is closed by the container when the @Remove method of the stateful session bean completes (or the stateful session bean instance is otherwise destroyed)."

A major disadvantage of indiscriminately using extended persistence contexts is that the persistence context exists for the entire lifecycle of the stateful session bean. In other words, the entity instances stored in the persistence context are not released until the bean itself is removed or destroyed.
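
The difference in lifecycle can be sketched with a toy model in plain Java. This is not the JPA API: a "context" here is just a map from id to instance, and the class and method names are hypothetical, chosen only for illustration. The sketch assumes each simulated transaction loads one distinct entity.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only, not the JPA API: a "context" is just a map of id -> instance.
class LifecycleSketch {

    // Simulates one transaction that loads a single entity into the given context.
    static void loadInTransaction(Map<Long, Object> context, long id) {
        context.computeIfAbsent(id, key -> new Object());
    }

    // Transaction-scoped: a fresh context per transaction, cleared when it ends.
    static int retainedWithTransactionScope(int transactions) {
        Map<Long, Object> context = null;
        for (long id = 1; id <= transactions; id++) {
            context = new HashMap<>();      // context is created with the transaction
            loadInTransaction(context, id);
            context.clear();                // context is closed when the transaction ends
        }
        return context == null ? 0 : context.size();
    }

    // Extended: one context lives as long as the (simulated) stateful bean.
    static int retainedWithExtendedScope(int transactions) {
        Map<Long, Object> context = new HashMap<>();
        for (long id = 1; id <= transactions; id++) {
            loadInTransaction(context, id); // nothing is cleared between transactions
        }
        return context.size();
    }

    public static void main(String[] args) {
        System.out.println(retainedWithTransactionScope(1000)); // prints '0'
        System.out.println(retainedWithExtendedScope(1000));    // prints '1000'
    }
}
```

In the extended case the number of retained instances grows with every new entity the bean touches, which is the memory cost described above.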

Another disadvantage is the more complex transaction flow you have to keep in mind when using it. Imagine a stateless session bean that calls a stateful session bean. Suppose that, before making that call, the stateless bean used a transaction-scoped entity manager. This creates a transaction-scoped persistence context and associates it with the current transaction. Suppose also that the stateful bean uses an extended entity manager. When the stateless session bean calls the stateful session bean, an exception is thrown: the container tries to associate the extended persistence context with the current transaction, but a different persistence context is already associated with it.
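
The collision can be modelled in a few lines of plain Java. Again, this is only a toy sketch of the rule, not the JPA or EJB API: ToyTransaction and statefulCallFails are hypothetical names, and IllegalStateException merely stands in for the exception a real container would raise.

```java
// Toy model only, not the JPA or EJB API.
class PropagationSketch {

    // Stand-in for a JTA transaction that can have one persistence context associated with it.
    static final class ToyTransaction {
        private Object associatedContext;

        // Mirrors the rule: associating a second, different context fails.
        void associate(Object context) {
            if (associatedContext != null && associatedContext != context) {
                throw new IllegalStateException(
                        "a different persistence context is already associated with this transaction");
            }
            associatedContext = context;
        }
    }

    // Returns true if the simulated call into the stateful bean fails.
    static boolean statefulCallFails(boolean statelessUsedEntityManagerFirst) {
        ToyTransaction tx = new ToyTransaction();
        Object transactionScopedContext = new Object();
        Object extendedContext = new Object();

        if (statelessUsedEntityManagerFirst) {
            // The stateless bean touched its entity manager before the call.
            tx.associate(transactionScopedContext);
        }
        try {
            // Entering the stateful bean: its extended context must join the transaction.
            tx.associate(extendedContext);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(statefulCallFails(true));  // prints 'true'
        System.out.println(statefulCallFails(false)); // prints 'false'
    }
}
```

The second case hints at why the problem is easy to miss: if the stateless bean has not used its entity manager yet, the call goes through, so whether the failure appears depends on the order of operations inside the transaction.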

So back to your question. If I understand correctly, the main reason you want to use extended persistence contexts is their ability to keep retrieved entity instances in memory. Despite being memory-inefficient (many unused entities are kept in memory), it could be argued that this is at least faster. However, even this advantage is questionable, since persistence providers tend to implement several layers of caching of their own (EclipseLink, for example, has a shared cache that is enabled by default).

Finally, I'd recommend chapter 6 of the book Pro JPA 2 in Java EE 8 as a great reference on the subject.