Tags: hibernate, lucene, full-text-search, criteria, hibernate-search

Lucene and Criteria API join on 2 different objects


I have a OneToOne relationship between two objects: Ad (which holds address_id as a foreign key) and Address.

For Address I'm using a full-text search query (Lucene), and for Ads a Criteria query.

I want to search for a keyword on some fields of Address, get the Ads linked to the Addresses returned by the Lucene search, and then apply the filters from the Criteria query to reduce the number of results.

Lucene:

    FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(entityManager);

    QueryBuilder queryBuilder = fullTextEntityManager.getSearchFactory().buildQueryBuilder().forEntity(Address.class).get();

    Query luceneQuery = queryBuilder
            .keyword()
            .wildcard()
            .onFields(Address_.__REGION, Address_.__TOWN, Address_.__STREET)
            .matching(input.toLowerCase() + "*")
            .createQuery();

    return fullTextEntityManager.createFullTextQuery(luceneQuery, Address.class);

Criteria API:

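    CriteriaBuilder criteriaBuilder = entityManager.getCriteriaBuilder();
    CriteriaQuery<Ad> criteria = criteriaBuilder.createQuery(Ad.class);

    Root<Ad> AdRoot = criteria.from(Ad.class);

    criteria.select(AdRoot);

    // Accumulates the optional filters; stays null when no filter was supplied.
    Predicate whereConditions = null;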

    if (AddressID != null) {
        whereConditions = (whereConditions == null) ? criteriaBuilder.equal(AdRoot.get(Ad_.address), AddressID)
                : criteriaBuilder.and(whereConditions, criteriaBuilder.equal(AdRoot.get(Ad_.address), AddressID));
    }

    if (PriceFrom != null && PriceTo != null)
        whereConditions = (whereConditions == null) ? criteriaBuilder.between(AdRoot.get(Ad_._Price), PriceFrom, PriceTo)
                : criteriaBuilder.and(whereConditions, criteriaBuilder.between(AdRoot.get(Ad_._Price), PriceFrom, PriceTo));

    if (SizeFrom != null && SizeTo != null)
        whereConditions = (whereConditions == null) ? criteriaBuilder.between(AdRoot.get(Ad_._Size), SizeFrom, SizeTo)
                : criteriaBuilder.and(whereConditions, criteriaBuilder.between(AdRoot.get(Ad_._Size), SizeFrom, SizeTo));

    if (whereConditions==null){
        return null;
    }

    criteria.where(whereConditions);

    return entityManager.createQuery(criteria).setMaxResults(MAX_NR_SESULTS_PER_PAGE).getResultList();

Solution

  • Technically you didn't really ask a question, so I'm going to give you some advice.

    Know that mixing Criteria queries and full-text queries is really hard to do when you want to use paging, which you usually do when you have a large number of results.

    Really, if you have to apply predicates to two different entities, the easiest path by far would be to use Hibernate Search's @IndexedEmbedded annotation:

    • add @Indexed to your Ad entity, and remove it from your Address entity (keep the @Field annotations on your Address entity though)
    • add @IndexedEmbedded to the address field of your Ad entity
      • if your Address entity may change over time, you should also add a reverse side to the association (an Ad ad field in your Address entity) and annotate it with @ContainedIn so that the index for Ad entities gets updated when addresses are updated.
    • add @Field annotations to the price and size fields of your Ad entity
    • build your query entirely using Hibernate Search, targeting the Ad entity.
      • copy the code you had in your existing full-text query, except you will retrieve a query builder for the Ad entity instead of the Address entity, and you will target the address.region, address.town, address.street fields.
      • create additional Lucene queries for the price and size fields, using range queries
      • combine all the Lucene queries into one using a boolean junction

    You won't have the full power of SQL WHERE clauses and joins, but in a lot of cases it's enough, and in your case it seems to be enough.
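
    The steps listed above could look roughly like the sketch below. It assumes the Hibernate Search 5.x annotations and query DSL (the same API family as the code in the question), field-level JPA mapping, and Integer price and size properties. The field names region, town, street, price and size, the AdSearch helper class and its search method are illustrative guesses, not taken from the real entities.

```java
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;
import javax.persistence.OneToOne;
import org.apache.lucene.search.Query;
import org.hibernate.search.annotations.ContainedIn;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.IndexedEmbedded;
import org.hibernate.search.jpa.FullTextEntityManager;
import org.hibernate.search.jpa.Search;
import org.hibernate.search.query.dsl.BooleanJunction;
import org.hibernate.search.query.dsl.QueryBuilder;

@Entity
@Indexed // Ad becomes the indexed root entity
class Ad {
    @Id Long id;
    @Field Integer price; // assumed type
    @Field Integer size;  // assumed type

    @OneToOne
    @IndexedEmbedded // exposes address.region, address.town, address.street in the Ad index
    Address address;
}

@Entity // no @Indexed here any more, only the @Field annotations remain
class Address {
    @Id Long id;
    @Field String region;
    @Field String town;
    @Field String street;

    @OneToOne(mappedBy = "address")
    @ContainedIn // re-index the owning Ad when this address changes
    Ad ad;
}

// Hypothetical helper, named for illustration only.
class AdSearch {
    @SuppressWarnings("unchecked")
    static List<Ad> search(EntityManager entityManager, String input,
                           Integer priceFrom, Integer priceTo,
                           Integer sizeFrom, Integer sizeTo,
                           int firstResult, int maxResults) {
        FullTextEntityManager ftem = Search.getFullTextEntityManager(entityManager);
        QueryBuilder qb = ftem.getSearchFactory().buildQueryBuilder().forEntity(Ad.class).get();

        // Same wildcard query as in the question, now targeting the embedded address fields.
        Query keyword = qb.keyword().wildcard()
                .onFields("address.region", "address.town", "address.street")
                .matching(input.toLowerCase() + "*")
                .createQuery();

        // Combine the keyword query with the optional range filters.
        BooleanJunction<?> junction = qb.bool();
        junction.must(keyword);
        if (priceFrom != null && priceTo != null) {
            junction.must(qb.range().onField("price").from(priceFrom).to(priceTo).createQuery());
        }
        if (sizeFrom != null && sizeTo != null) {
            junction.must(qb.range().onField("size").from(sizeFrom).to(sizeTo).createQuery());
        }

        // Paging is now done by a single engine, on the fully filtered result set.
        return ftem.createFullTextQuery(junction.createQuery(), Ad.class)
                .setFirstResult(firstResult)
                .setMaxResults(maxResults)
                .getResultList();
    }
}
```

    After changing the mapping, the existing data would have to be re-indexed once before such a query returns anything.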

    Why it's hard to combine results from different queries when doing paging

    If you want details about why this is hard... Warning: it's a bit complex so I'm not sure I can explain this clearly.

    Paging is usually performed by the query engine, outside of your application and even outside of Hibernate. But you will combine the results of two queries after those queries happened, in your application, so you're applying another level of filtering after paging. This means the first results that have been skipped by the query engine might not be matching results after all.

    So if you skipped 100 results for instance, maybe only 50 of those skipped results were actual results in your query. So you actually retrieved results starting from the 51st result, whereas you asked for results starting from the 101st...
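
    To make that concrete, here is a small in-memory simulation. Nothing in it is Hibernate API: enginePage stands in for the query engine, passesSecondFilter for the filter applied afterwards in the application, and all names are made up for the illustration. Every other row matches the second filter, as in the 100/50 example above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the two-step approach; no real query engine involved.
class OffsetPagingPitfall {

    // Simulates the query engine: rows have ids 1..total, paged by offset and limit.
    static List<Integer> enginePage(int total, int offset, int limit) {
        List<Integer> page = new ArrayList<>();
        for (int id = offset + 1; id <= Math.min(total, offset + limit); id++) {
            page.add(id);
        }
        return page;
    }

    // Simulates the second filter applied in the application: only even ids match.
    static boolean passesSecondFilter(int id) {
        return id % 2 == 0;
    }

    // Naive approach: the engine skips `offset` rows, then the application filters the page.
    static List<Integer> naivePage(int total, int offset, int limit) {
        List<Integer> result = new ArrayList<>();
        for (int id : enginePage(total, offset, limit)) {
            if (passesSecondFilter(id)) {
                result.add(id);
            }
        }
        return result;
    }

    // Reference result: filter everything first, then page. Correct, but loads every row.
    static List<Integer> correctPage(int total, int offset, int limit) {
        List<Integer> matching = new ArrayList<>();
        for (int id : enginePage(total, 0, total)) {
            if (passesSecondFilter(id)) {
                matching.add(id);
            }
        }
        int from = Math.min(offset, matching.size());
        int to = Math.min(offset + limit, matching.size());
        return new ArrayList<>(matching.subList(from, to));
    }

    public static void main(String[] args) {
        System.out.println("naive:   " + naivePage(1000, 100, 10));
        System.out.println("correct: " + correctPage(1000, 100, 10));
    }
}
```

    With these numbers the naive page is [102, 104, 106, 108, 110]: only five rows, and they are the 51st to 55th matching results. The page that was actually requested starts at 202.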

    There are ways around that, one possibility being to switch from a "paging by index" strategy to a "paging by range on a unique sorting key" strategy.

    Essentially, instead of asking for the "first 10 results", then "results 11 to 20", and so forth, you will for example sort by creation date, ask for the "first 10 results", then the "first 10 results with a creation date higher than foo" (foo being the creation date of the 10th result you received in the previous page).
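
    A minimal in-memory sketch of that idea, again with made-up names and no Hibernate API: an int key stands in for the unique creation date, engineAfter for the engine's "first N rows with a key higher than foo" query, and passesSecondFilter for the filter applied in the application. The loop keeps asking the engine for more rows until the page is full.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of paging by range on a unique sorting key.
class KeysetPagingSketch {

    // Simulates "the first `limit` rows whose sort key is greater than `afterKey`".
    // Rows have the unique keys 1..total and are returned in ascending key order.
    static List<Integer> engineAfter(int total, int afterKey, int limit) {
        List<Integer> rows = new ArrayList<>();
        for (int key = afterKey + 1; key <= total && rows.size() < limit; key++) {
            rows.add(key);
        }
        return rows;
    }

    // Simulates the second filter applied in the application: only even keys match.
    static boolean passesSecondFilter(int key) {
        return key % 2 == 0;
    }

    // Returns up to `pageSize` matching rows that come after `afterKey`.
    static List<Integer> nextPage(int total, int afterKey, int pageSize) {
        List<Integer> page = new ArrayList<>();
        int cursor = afterKey;
        while (page.size() < pageSize) {
            List<Integer> batch = engineAfter(total, cursor, pageSize);
            if (batch.isEmpty()) {
                break; // the engine has no more rows
            }
            for (int key : batch) {
                cursor = key; // last key examined, whether it matched or not
                if (passesSecondFilter(key)) {
                    page.add(key);
                    if (page.size() == pageSize) {
                        break;
                    }
                }
            }
        }
        return page;
    }

    public static void main(String[] args) {
        List<Integer> first = nextPage(1000, 0, 10);
        // The next page starts after the last key of the previous page.
        List<Integer> second = nextPage(1000, first.get(first.size() - 1), 10);
        System.out.println(first);
        System.out.println(second);
    }
}
```

    Because the next request is expressed as "after key 20" rather than "skip 10 rows", rows dropped by the second filter no longer shift the page boundaries.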

    This strategy has a few downsides, most notably:

    • Your users won't be able to jump to an arbitrary page (except the first one), but only go to the next page, or to the previous one with a bit more work.
    • You have to sort on a unique key. If you don't have such a key, or if you want to sort by full-text query score, then it won't work.

    However, this strategy is efficient, and it is the only one I know of that can produce correct results.