Having the typical method which returns a paginated result, using CriteriaBuilder
and performing 2 queries:
We have noticed that the first query, JPA does not optimize it at all because it's using the exists (from Oracle).
Java code:
Root<Foo> from = criteriaQuery.from(Foo.class);
//... predicates
CriteriaQuery<Long> countQuery = criteriaBuilder.createQuery(Long.class)
.select(criteriaBuilder.countDistinct(from))
.where(predicates.toArray(new Predicate[predicates.size()]));
Long numberResults = entityManager.createQuery(countQuery).getSingleResult();
SQL generated query:
SELECT COUNT(t0.REFERENCE)
FROM foo t0
WHERE EXISTS (
SELECT t1.REFERENCE
FROM foo t1
WHERE ((((t0.REFERENCE = t1.REFERENCE) AND (t0.VERSION_NUM = t1.VERSION_NUM)) AND (t0.ISSUER = t1.ISSUER)) AND (t1.REFERENCE LIKE ? AND (t1.VERSION_STATUS = ?)))
);
How do I avoid using the exists
? Is there something wrong with the java code?
For different reasons, this issue and this related article enumerate some of them, EclipseLink uses EXISTS
in the countDistinct
operation implementation.
Although I can agree with you, be aware that the performance offered by EXISTS
in Oracle is in fact very dependent of the use case, and it doesn't have to be poor. Please, consider review this mythical blog entry in the Tom Kyte blob.
So my advice is, please, keep using the generated code and corresponding SQL.
If you need or want to use a different approach, a perhaps more performant way of counting the records could be fetching the ids of the entities that match the provided predicates (the actual performance in fact is mostly dependent on these predicates in fact), and count the results in memory, with Java. I mean:
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
// I assume reference is String here
CriteriaQuery<String> query = cb.createQuery(String.class);
Root<Foo> root = query.from(Foo.class);
query
.select(root.get("reference"))
.distinct(true)
.where(predicates.toArray(new Predicate[predicates.size()]))
;
List<String> references = entityManager.createQuery(query).getResultList();
int count = references.size();
Although I think it is always not advisable, if the amount of data is not large, you could even fetch the results once from the database, and do the paging in memory with Java, it is straightforward using subList
, for instance.
At a final word, AFAIK other JPA providers such as Hibernate implements count in a different way: if switching the JPA provider is an option you could try using it instead.