Using the following CriteriaBuilder returns results within 1 second when I have a map of 1 or 2 entries, however when it reaches 3 the performance of the query drops. Then adding a 4th entry drops the performance even more to the point where I've not actually seen it return a result set.
I have tried to use sub-queries instead but it seems to come across a similar issue where 2 EXISTS sub-queries will return data within a reasonable time but 3 or more will result in a large performance hit.
I have noticed that without the ORDER BY the query works great even with 3 entries ( 1 second or less ), however as soon as I add it back on the same 3 entries search takes over 30 seconds.
Here is the builder:
CriteriaBuilder builder = em.getCriteriaBuilder();
CriteriaQuery<Study> q = builder.createQuery(Study.class);
Root<Study> rootStudy = q.from(Study.class);
List<Predicate> pList = new ArrayList<Predicate>( searchMap.size() );
q.select(rootStudy);
Join<Study, Demographics> dem = rootStudy.join( "demographics" );
Join<Study, MetaData> met = rootStudy.join( "metadata" );
// add all conditions
for (Map.Entry<XmlTagHashKey, String> entry : searchMap.entrySet() ) {
MapJoin< Demographics, XmlTagHashKey, Field2 > mapJoin = demJoin.joinMap( "fields" );
Path<String> attributePath = mapJoin.get( "value" );
Predicate p = builder.and( builder.equal( mapJoin.key(), entry.getKey() ),
builder.like( attributePath, "%" + entry.getValue() + "%" ));
pList.add( p );
}
q.where( pList.toArray(new Predicate[]{}) );
q.orderBy( builder.desc( met.<String>get( XMLTagEnum.bbrad_status_updated.name() ) ) );
TypedQuery<Study> typedQuery = em.createQuery( q );
typedQuery.setMaxResults( 100 );
return getResultsList(typedQuery);
Here is the SQL that is created from the above builder when I have 3 map entries:
SELECT t1.id,
t1.study_id,
t1.demographics_id,
t1.metadata_id
FROM demographics_has_fields t8,
field t7,
demographics_has_fields t6,
field t5,
demographics_has_fields t4,
field t3,
demographics t2,
study t1,
metadata t0
WHERE ( ( ( ( ( t6.xmltag = ? )
AND t5.value LIKE ? )
AND ( ( t4.xmltag = ? )
AND t3.value LIKE ? ) )
AND ( ( t8.xmltag = ? )
AND t7.value LIKE ? ) )
AND ( ( ( ( ( t2.id = t1.demographics_id )
AND ( ( t6.demographics_id = t2.id )
AND ( t5.id = t6.field_id ) ) )
AND ( ( t4.demographics_id = t2.id )
AND ( t3.id = t4.field_id ) ) )
AND ( ( t8.demographics_id = t2.id )
AND ( t7.id = t8.field_id ) ) )
AND ( t0.id = t1.metadata_id ) ) )
ORDER BY t0.bbradstatusupdated DESC
Here is an example of the table structure:
What I am trying to achieve is to select a Study where the value of a field contains 'james' and the xml tag would be '1' ( 1 is name, 2 is gender ). The reason I use a map is because you might also want to search on additional fields, for example gender. I think the only way for this to work would be with exists sub-queries or joins as I've done above.
Unfortunately this data structure is pretty bad but was inherited from a previous database design, so I'm kind of stuck with it.
If I understand correctly what you are doing, then you are adding to joins to the query for every map entry. This means, that you multiply the amount of data the query has to process by a huge number for each map element. This is bound to yield bad scalability.
To verify if this is really the problem, do the following:
Run the (sql) statements for 2 and three map entries against the database and measure their performance. Make sure you fetch all entries an not just the first.
If my guess is right, you will see more or less the same performance drop, so this is not a JPA/eclipse-link problem.
The solution might be to switch to a different datamodel, possibly a star schema. But this will affect a lot of thing in your application and we know to little about that.