Tags: java, search, lucene, special-characters, hibernate-search

Hibernate Search Lucene query parser with Special Characters


FIRST QUESTION: Can somebody explain how a Lucene query in Hibernate Search handles special characters? I have read the Hibernate Search documentation and the Lucene regexp syntax, but neither matches the generated queries and results I am seeing.

Let's assume I have the following database entries:

name
Will
Will Smith
Will - Smith
Will-Smith

and I am using the following query:

Query query = queryBuilder  
    .keyword()           
    .onField("firstName")
    .matching(input)
    .createQuery();     

Now I search with the following inputs:

Will -> returns all 4 entries, with the following generated query: FullTextQueryImpl(firstName:will)

Will Smith -> also returns all 4 entries with the following generated query: FullTextQueryImpl(firstName:will firstName:smith)

Will - Smith -> also returns all 4 entries, with the following generated query: FullTextQueryImpl(firstName:will firstName:smith). Where did the "-" go? According to the Lucene query syntax, shouldn't it exclude everything after the "-"?

Will-Smith -> same here

Will\-Smith -> here I tried escaping the "-" with a backslash, but got the same result

Will -Smith -> same here

SECOND QUESTION: Let's assume I have the following database entries, where the entry without a numerical suffix always exists and entries with a numerical suffix may or may not be in the database. What would a Lucene query for this look like?

name
Will
Will1
Will2

Solution

  • You can experiment with Lucene analyzers to see how your input is tokenized behind the scenes. Here is a tutorial: https://www.baeldung.com/lucene-analyzers

    The tokenizer is pluggable, so you can change how special characters are treated.
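
To illustrate why the "-" disappears: a keyword() query built with the Hibernate Search DSL passes the input through the field's analyzer rather than through the Lucene query parser, so "-" is never read as an operator. With a default-style analyzer, the text is lowercased and split on punctuation and whitespace, which is consistent with the generated queries shown in the question. The sketch below is a simplified stand-in written with the Java standard library only. It is not Lucene's StandardAnalyzer, and the class name AnalyzerSketch and the method tokenize are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Simplified stand-in for an analyzer. This is NOT Lucene's StandardAnalyzer;
// it only mimics the two effects relevant here: lowercasing and splitting
// on anything that is not a letter or a digit.
class AnalyzerSketch {

    // Lowercase the input, split it on runs of non-alphanumeric characters,
    // and drop any empty tokens produced by leading separators.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        String lowered = input.toLowerCase(Locale.ROOT);
        for (String part : lowered.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "Will", "Will Smith", "Will - Smith",
            "Will-Smith", "Will\\-Smith", "Will -Smith", "Will1"
        };
        // Print the tokens each input is reduced to before a query is built.
        for (String input : inputs) {
            System.out.println(input + " -> " + tokenize(input));
        }
    }
}
```

Under these assumptions, every input containing both names is reduced to the same two tokens, and each indexed name contains the token "will", which would explain why all four entries match. Note that "Will1" stays a single token in this sketch, so a search for "Will" alone would not match it as a term; to treat such values differently you would need a different analyzer or tokenizer on the field.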