Hibernate Search 6: Case-insensitive searching of aggregable fields while retaining case-sensitive aggregation results


I have used the new aggregation functionality of Hibernate Search 6 to develop a classic "faceted search" interface, in which the various search fields in the UI are accompanied by the most popular choices taken from the aggregation data of the SearchResult. This works beautifully.

However, I would like users to be able to search these fields case-insensitively, so that they are not limited to choosing from the aggregation results and are not penalised for typing in the wrong case.

I have applied a lowercase normalizer to the aggregable field definition, which allows me to search case-insensitively. However, once I do this, all of the aggregation data retrieved from the SearchResult and presented to the user is also in lowercase.

Is there a way to allow case-insensitive searches while retaining the original case in the aggregation results?

I have attempted to use projectable( Projectable.YES ) in my field definition, in the hope that this would return the original case, but it had no effect.

My current field template definition is:

indexSchemaObjectField.fieldTemplate( "template", f -> f.asString()
    .aggregable( Aggregable.YES )
    .projectable( Projectable.YES )
    .normalizer( "lowercase" )
).multiValued();

and my lowercase normalizer is defined as:

luceneAnalysisConfigurationContext.normalizer( "lowercase" )
    .custom()
    .tokenFilter( "lowercase" );

I'm using the Lucene backend.


Solution

  • Ideally you'd use multi-fields, but that's not available at the moment (https://hibernate.atlassian.net/browse/HSEARCH-3465).

    In the meantime, I would rely on two separate fields:

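    // Declare one template for the aggregation fields.
    // Templates are matched in declaration order, so declare this one first:
    // the search template below has no glob and would otherwise match "*_agg" paths too.
    indexSchemaObjectField.fieldTemplate( "template_agg", f -> f.asString()
        .aggregable( Aggregable.YES )
    )
    .matchingPathGlob( "*_agg" )
    .multiValued();

    // Declare one template for the search fields.
    // Template names must be unique within the object field, hence the distinct names.
    indexSchemaObjectField.fieldTemplate( "template_search", f -> f.asString()
        .normalizer( "lowercase" )
    ).multiValued();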
    

    Then in your bridge, you would duplicate the value: first populate the field "<fieldname>", then the field "<fieldname>_agg" with the same value.

    Finally, when searching, you would use "<fieldname>" for predicates, but "<fieldname>_agg" for aggregations.
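
To make the two-field idea concrete, here is a minimal plain-Java sketch of what the bridge and the query end up doing. It deliberately does not use the Hibernate Search API: `TwoFieldSketch`, `Doc`, `addValue`, `search` and `aggregate` are hypothetical names invented for illustration, and the in-memory map only stands in for the index. It assumes the normalizer does nothing more than lowercase the value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of the two-field approach (not the Hibernate Search API).
final class TwoFieldSketch {

    // Stand-in for one indexed document with multi-valued string fields.
    static final class Doc {
        private final Map<String, List<String>> fields = new HashMap<>();

        // What the bridge would do: write the same value to both fields.
        void addValue(String fieldName, String value) {
            put(fieldName, value.toLowerCase(Locale.ROOT)); // "<fieldname>": normalized, for predicates
            put(fieldName + "_agg", value);                 // "<fieldname>_agg": original case, for aggregations
        }

        private void put(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        List<String> values(String name) {
            return fields.getOrDefault(name, Collections.emptyList());
        }
    }

    // Predicate side: normalize the search term, then match against "<fieldname>".
    static List<Doc> search(List<Doc> docs, String fieldName, String term) {
        String normalized = term.toLowerCase(Locale.ROOT);
        List<Doc> hits = new ArrayList<>();
        for (Doc doc : docs) {
            if (doc.values(fieldName).contains(normalized)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    // Aggregation side: count documents per original-case value of "<fieldname>_agg".
    static Map<String, Long> aggregate(List<Doc> hits, String fieldName) {
        Map<String, Long> counts = new TreeMap<>();
        for (Doc doc : hits) {
            // Deduplicate per document so a repeated value counts the document once.
            for (String value : new LinkedHashSet<>(doc.values(fieldName + "_agg"))) {
                counts.merge(value, 1L, Long::sum);
            }
        }
        return counts;
    }
}
```

With three documents whose "color" values are "Red", "Red" and "Blue", calling `search(docs, "color", "RED")` matches the two "Red" documents, and `aggregate(hits, "color")` reports the bucket under its original spelling "Red". Note that values differing only in case (for example "Red" and "red") end up in separate buckets, since the aggregation field is not normalized.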