We are using Spring JPA criteria query ( javax.persistence.criteria.CriteriaQuery) to fetch data from database. We use the javax.persistence.criteria.Predicate to build the predicates. We have 1500 'OR' predicates in one query. And each predicate having 6 'AND' predicates.
SELECT (*) FROM TABLE_ABC as T1 WHERE (t1.column1 = 'c11' AND
t1.column2 = 'c12' AND t1.column3 = 'c13' AND t1.column4 = 'c14' AND
t1.column5 = 'c15')
OR
(t1.column1 = 'c21' AND t1.column2 = 'c22'
AND t1.column3 = 'c23' AND t1.column4 = 'c24' AND t1.column5 = 'c25')
OR
(t1.column1 = 'c31' AND t1.column2 = 'c32'
AND t1.column3 = 'c33' AND t1.column4 = 'c34' AND t1.column5 = 'c35').....
Earlier we were using "org.hibernate.Criteria" and using 'Conjuction' and 'Disjunction' to build the same query. This approach was working efficiently. As the "org.hibernate.Criteria" is depricated we are moving to the javax-criteriaquery package. We are facing big degradation in performance. The drill down of logs indicates that time is consumed more in the step
=> entityManager.createQuery(), Which performs following operations
These operations are the more time consuming.
Is there any solution to make these execution faster? Is 'javax.persistence.criteria.CriteriaQuery' the way forward?
Please help here!
Please see code below:
@Transactional(propagation = Propagation.REQUIRES_NEW, isolation = Isolation.READ_COMMITTED)
public getData(List<DataDAO> dataReqList) {
{
CriteriaBuilder builder = em.getCriteriaBuilder();
CriteriaQuery<DataReq> criteriaQuery = builder.createQuery(DataReq.class);
Root<DataReq> dataReqRoot = criteriaQuery.from(DataReq.class);
Predicate[] predicateArr = new Predicate[dataReqList.size()];
for (DataDAO dataReq : dataReqList) {
predicateArr[i] = builder.and(
builder.equal(dataReqRoot.get(TEST_S), dataReq.getS()),
builder.equal(dataReqRoot.get(TEST_T2), dataReq.getT2()),
builder.equal(dataReqRoot.get(K1), dataReq.getK1()),
builder.equal(dataReqRoot.get(K2), dataReq.getK2()),
builder.equal(dataReqRoot.get(TEST_P), dataReq.getP()),
builder.equal(dataReqRoot.get(TEST_T1),
dataReq.getT1(),
builder.equal(dataReqRoot.get(TEST_I), dataReq.getI()));
i++;
}
List<Data> dataResultList = getResultList(builder, criteriaQuery, predicateArr);
}
private List<Data> getResultList(CriteriaBuilder builder,
CriteriaQuery<DataReq> criteriaQuery, Predicate[] predicateArr) {
criteriaQuery.where(builder.or(predicateArr));
TypedQuery<DataReq> query = entityManager.createQuery(criteriaQuery);
List<DataReq> dataReqList = null;
try {
dataReqList = query.getResultList();
} catch(Exception e) {
...
}
return convertToData(dataReqList);
}
The same query with "org.hibernate.Criteria" and using 'Conjuction' and 'Disjunction' works very efficiently in milliseconds.
For context, depending on the database you're using, this is like a dynamic IN
predicate with row value expressions. If supported, you could also write:
WHERE (t1.column1, t1.column2, t1.column3, t1.column4, t1.column5, t1.column6) IN (
('c11', 'c12', 'c13', 'c14', 'c15', 'c16'),
('c21', 'c22', 'c23', 'c24', 'c25', 'c26'),
...
)
Such long IN
lists will turn into problems not only in client libraries that produce dynamic SQL, but also on the server side. You mentioned bind variables, perhaps the old API you were using was not using bind variables after all, but inlined all the values into the query. I've seen that perform much better in Oracle for large sets of parameters, so this is one of the cases where inline values might be better than bind variables.
Since you're using Hibernate, you could try enabling
<property name="hibernate.criteria.literal_handling_mode" value="bind"/>
See HHH-9576 and this answer
The above would (maybe) help restore the previous performance you've experienced, but depending on your IN
list size, there might be even better solutions. I've blogged about an alternative where you could use arrays instead of individual bind values, in case you're using Oracle or PostgreSQL.
Another option that I've seen work very often is to use temporary tables of the form (assuming Oracle):
CREATE GLOBAL TEMPORARY TABLE predicates (
column1 VARCHAR2(100),
column2 VARCHAR2(100),
column3 VARCHAR2(100),
column4 VARCHAR2(100),
column5 VARCHAR2(100),
column6 VARCHAR2(100)
)
And then, prior to running your query, batch insert all the various predicate values into that table and then semi join it:
WHERE (t1.column1, t1.column2, t1.column3, t1.column4, t1.column5, t1.column6) IN (
SELECT column1, column2, column3, column4, column5, column6
FROM predicates
)
If you don't have temporary tables, you can try ordinary tables instead, and add a transaction_id column to it, cleaning up its contents manually after your queries.