Search code examples
javaspring-data-jpahibernate-criteriacriteria-apicriteriaquery

entityManager.createQuery() taking lot of time to build query and bind the parameters. Performance affected


We are using Spring JPA criteria query ( javax.persistence.criteria.CriteriaQuery) to fetch data from database. We use the javax.persistence.criteria.Predicate to build the predicates. We have 1500 'OR' predicates in one query. And each predicate having 6 'AND' predicates.

SELECT (*) FROM TABLE_ABC as T1 WHERE  (t1.column1 = 'c11' AND
   t1.column2 = 'c12' AND t1.column3 = 'c13' AND t1.column4 = 'c14' AND
   t1.column5 = 'c15') 
  OR 
   (t1.column1 = 'c21' AND t1.column2 = 'c22'
   AND t1.column3 = 'c23' AND t1.column4 = 'c24' AND t1.column5 = 'c25')
   OR 
    (t1.column1 = 'c31' AND t1.column2 = 'c32'
   AND t1.column3 = 'c33' AND t1.column4 = 'c34' AND t1.column5 = 'c35').....

Earlier we were using "org.hibernate.Criteria" and using 'Conjuction' and 'Disjunction' to build the same query. This approach was working efficiently. As the "org.hibernate.Criteria" is depricated we are moving to the javax-criteriaquery package. We are facing big degradation in performance. The drill down of logs indicates that time is consumed more in the step

=> entityManager.createQuery(), Which performs following operations


  1. CriteriaCompiler.compile
  2. CriteriaQueryImpl$1.buildCompiledQuery
  3. CriteriaCompiler$1$1.bind

These operations are the more time consuming.

Is there any solution to make these execution faster? Is 'javax.persistence.criteria.CriteriaQuery' the way forward?

Please help here!

Please see code below:

@Transactional(propagation = Propagation.REQUIRES_NEW, isolation = Isolation.READ_COMMITTED)
public getData(List<DataDAO> dataReqList) {
{

    CriteriaBuilder builder = em.getCriteriaBuilder();
    CriteriaQuery<DataReq> criteriaQuery = builder.createQuery(DataReq.class);
    Root<DataReq> dataReqRoot = criteriaQuery.from(DataReq.class);
    Predicate[] predicateArr = new Predicate[dataReqList.size()];

    for (DataDAO dataReq : dataReqList) {

                    predicateArr[i] = builder.and(
                            builder.equal(dataReqRoot.get(TEST_S), dataReq.getS()),
                            builder.equal(dataReqRoot.get(TEST_T2), dataReq.getT2()),
                            builder.equal(dataReqRoot.get(K1), dataReq.getK1()),
                            builder.equal(dataReqRoot.get(K2), dataReq.getK2()),
                            builder.equal(dataReqRoot.get(TEST_P), dataReq.getP()),
                            builder.equal(dataReqRoot.get(TEST_T1),
                                    dataReq.getT1(),
                            builder.equal(dataReqRoot.get(TEST_I), dataReq.getI()));

                            i++;
    }

    List<Data> dataResultList = getResultList(builder, criteriaQuery, predicateArr);

}

private List<Data> getResultList(CriteriaBuilder builder,
            CriteriaQuery<DataReq> criteriaQuery, Predicate[] predicateArr) {
    criteriaQuery.where(builder.or(predicateArr));
    TypedQuery<DataReq> query = entityManager.createQuery(criteriaQuery);

    List<DataReq> dataReqList = null;
    try {

        dataReqList = query.getResultList();
    } catch(Exception e) {
    ...
    }

    return convertToData(dataReqList);

}

The same query with "org.hibernate.Criteria" and using 'Conjuction' and 'Disjunction' works very efficiently in milliseconds.


Solution

  • For context, depending on the database you're using, this is like a dynamic IN predicate with row value expressions. If supported, you could also write:

    WHERE (t1.column1, t1.column2, t1.column3, t1.column4, t1.column5, t1.column6) IN (
      ('c11', 'c12', 'c13', 'c14', 'c15', 'c16'),
      ('c21', 'c22', 'c23', 'c24', 'c25', 'c26'),
      ...
    )
    

    Such long IN lists will turn into problems not only in client libraries that produce dynamic SQL, but also on the server side. You mentioned bind variables, perhaps the old API you were using was not using bind variables after all, but inlined all the values into the query. I've seen that perform much better in Oracle for large sets of parameters, so this is one of the cases where inline values might be better than bind variables.

    Since you're using Hibernate, you could try enabling

    <property name="hibernate.criteria.literal_handling_mode" value="bind"/>
    

    See HHH-9576 and this answer

    A possibly even better solution using arrays

    The above would (maybe) help restore the previous performance you've experienced, but depending on your IN list size, there might be even better solutions. I've blogged about an alternative where you could use arrays instead of individual bind values, in case you're using Oracle or PostgreSQL.

    A possibly even better solution using temporary tables

    Another option that I've seen work very often is to use temporary tables of the form (assuming Oracle):

    CREATE GLOBAL TEMPORARY TABLE predicates (
      column1 VARCHAR2(100),
      column2 VARCHAR2(100), 
      column3 VARCHAR2(100), 
      column4 VARCHAR2(100), 
      column5 VARCHAR2(100), 
      column6 VARCHAR2(100)
    )
    

    And then, prior to running your query, batch insert all the various predicate values into that table and then semi join it:

    WHERE (t1.column1, t1.column2, t1.column3, t1.column4, t1.column5, t1.column6) IN (
      SELECT column1, column2, column3, column4, column5, column6
      FROM predicates
    )
    

    If you don't have temporary tables, you can try ordinary tables instead, and add a transaction_id column to it, cleaning up its contents manually after your queries.