Tags: java, hibernate, lucene, acl, hibernate-search

How to combine Hibernate Search (Lucene) with paging and ACLs


I am using Spring Security with ACLs to secure the documents in my application. I also use Hibernate Search (on top of Lucene) to search for the documents, and this search supports paging. (The documents are only metadata for documents stored in a database.)

// Wrap the JPA EntityManager to get access to the full-text API.
FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(entityManager);
QueryBuilder queryBuilder = fullTextEntityManager.getSearchFactory()
        .buildQueryBuilder().forEntity(Document.class).get();

// Keyword query over all searchable fields.
Query query = queryBuilder.keyword()
        .onFields(fieldNames.toArray(new String[0]))
        .matching(searchQuery)
        .createQuery();

// Paging is applied by the full-text query, so only one page of hits is loaded.
FullTextQuery fullTextQuery = fullTextEntityManager.createFullTextQuery(query, Document.class);
fullTextQuery.setFirstResult(pageable.getFirstItem());
fullTextQuery.setMaxResults(pageable.getPageSize());

Now I have to combine the paging with the ACLs. The only idea I have at the moment is to remove the paging from the FullTextQuery, read all search result documents, filter them by their ACLs, and then do the paging by hand. But I don't like that solution, because it loads all the documents instead of only the ones for the requested page.
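For illustration, the manual approach described above could be sketched roughly as follows. This is only a minimal sketch: `AclPaging`, `page`, and the `canRead` predicate are hypothetical names, and the predicate merely stands in for whatever ACL check the application really performs (for example, a call into Spring Security's permission evaluation).

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical helper: apply the ACL check first, then page over what is left. */
final class AclPaging {
    private AclPaging() {}

    /**
     * @param hits      all search hits in relevance order (fetched without paging)
     * @param canRead   stand-in for the real ACL check (assumption)
     * @param firstItem zero-based offset into the ACL-filtered hits
     * @param pageSize  maximum number of items on the page
     */
    static <T> List<T> page(List<T> hits, Predicate<T> canRead, int firstItem, int pageSize) {
        return hits.stream()
                .filter(canRead)   // drop hits the user may not see
                .skip(firstItem)   // page offset counts only visible hits
                .limit(pageSize)   // short-circuits once the page is full
                .collect(Collectors.toList());
    }
}
```

Because the stream is evaluated lazily, the ACL check stops once the page is full. The unpaged hit list still has to be fetched first, and the total number of visible hits is unknown unless every hit is checked.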

Does anybody have a better idea?


Solution

  • I have hit the same problem, and I don't think there is a simple answer.

    I think there are only two solutions. The first is the one you have suggested, which has the performance problems you describe: you have to load every matching document, resolve the ACL for each result, and then do your own paging. The alternative is to push this work to the indexing side and index your ACL information in Lucene. At query time you add filter terms based on the current user's identity, groups, roles, and permissions, so results the user can't see never come back and the search's own paging keeps working. The cost is that the index must be maintained with ACL information. If your ACL is simple, this may be an option. If your ACL is hierarchical, it's still an option but more complicated. It's also tricky to keep the index up to date when ACLs change.

    The fact that you are starting to look into this sort of functionality may indicate that you are beginning to stretch your Database/Hibernate/Lucene solution. Perhaps a content repository like Jackrabbit would be a better fit? That is probably a step too far, but it may be worth a look to see how it handles this. Alternatively, take a look at Solr, particularly this issue, which describes what a thorny problem it is.
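To make the index-side approach from this answer more concrete, here is a minimal in-memory sketch. It does not use the Hibernate Search or Lucene APIs: `IndexedDoc`, `AclIndexSearch`, and the `user:`/`role:` token format are all hypothetical, and the list only stands in for the index. With Hibernate Search the same idea would presumably mean indexing the ACL tokens as an extra field on `Document` and combining the keyword query with a clause on that field (for example through the query DSL's `bool()` and `must(...)`), so that `setFirstResult` and `setMaxResults` apply to hits that are already filtered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Stand-in for an index entry: searchable text plus ACL tokens written at index time. */
final class IndexedDoc {
    final String title;
    final Set<String> readers; // hypothetical token format: "user:alice", "role:STAFF"

    IndexedDoc(String title, String... readers) {
        this.title = title;
        this.readers = new HashSet<>(Arrays.asList(readers));
    }
}

final class AclIndexSearch {
    private AclIndexSearch() {}

    /** Builds the filter tokens for the current user (assumed to come from the security context). */
    static Set<String> tokensFor(String user, Collection<String> roles) {
        Set<String> tokens = new HashSet<>();
        tokens.add("user:" + user);
        for (String role : roles) {
            tokens.add("role:" + role);
        }
        return tokens;
    }

    /** Text match AND ACL match are evaluated together, so paging only ever sees visible hits. */
    static List<String> search(List<IndexedDoc> index, String term, Set<String> userTokens,
                               int firstResult, int maxResults) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> page = new ArrayList<>();
        int visible = 0;
        for (IndexedDoc doc : index) {
            boolean matchesText = doc.title.toLowerCase(Locale.ROOT).contains(needle);
            boolean readable = !Collections.disjoint(doc.readers, userTokens);
            if (!(matchesText && readable)) {
                continue;                 // filtered out inside the "index"
            }
            if (visible++ < firstResult) {
                continue;                 // still before the requested page
            }
            if (page.size() >= maxResults) {
                break;                    // page is full
            }
            page.add(doc.title);
        }
        return page;
    }
}
```

The trade-off mentioned in the answer shows up in `IndexedDoc.readers`: whenever a document's ACL changes, its index entry has to be rewritten, and hierarchical ACLs would need to be flattened into tokens at index time.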