java spring-boot lucene hibernate-search

field with 'indexNullAs' not included in result set


I am trying to make use of @IndexedEmbedded's indexNullAs property on a property which is null in the database:

...
@Indexed
public class DocVersion implements Serializable {
...
    @IndexedEmbedded(indexNullAs = "valid")
    @OneToOne(cascade = CascadeType.ALL, orphanRemoval = true)
    @JoinColumn(name = "rule_id")
    private ValidityRule validityRule;
...
}

In the database I have two records: one has a validityRule associated with it, and the other has a null validityRule.

This is part of the query I have constructed for Hibernate Search:

FullTextQueryImpl(-(+document2.docType.documentType:FOLDER) +(+document2.eDocState:ACTIVE) +(document2.docType.documentType:ACCOUNTSTATEMENT) +(+validityRule.validFrom:[* TO 0000020190601] +validityRule.validTo:[0000020190701 TO *] validityRule:valid))

This returns just one record: the one that is associated with a validityRule and satisfies the date criteria.

My problem is that it does not return the second row, which has no validityRule set. My understanding of indexNullAs is that the field is indexed as the string 'valid' when the database field is not set, and you can see that term in the query.

So why don't I get the second row in the result set? What am I doing wrong? I have deleted all the indexes from the disk, and the program reindexes the records on every run.

hibernate-search-orm: 5.10.4.Final

Thanks for explanation.


Solution

  • The problem is with this part of the query:

    +validityRule.validFrom:[* TO 0000020190601] +validityRule.validTo:[0000020190701 TO *] validityRule:valid
    

    What you are saying here is:

    • validityRule.validFrom MUST be at or below 0000020190601
    • validityRule.validTo MUST be at or above 0000020190701
    • optionally, validityRule SHOULD be equal to valid; when it is, the score of the document is slightly higher.

    This roughly translates to the following boolean expression:

    (
        validityRule.validFrom:[* TO 0000020190601]
        AND validityRule.validTo:[0000020190701 TO *]
    )
    

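    The "SHOULD" clause doesn't appear in it: as soon as a boolean query contains at least one "MUST" clause, its "SHOULD" clauses become optional. They do not affect matching, only the score. The record without a validityRule has no validFrom or validTo value, so it fails both "MUST" clauses and is excluded, even though it does match validityRule:valid.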

    What's important here is that "MUST"/"SHOULD" are not directly equivalent to the boolean operators "AND"/"OR": they have their own meaning. As a rule of thumb, if you want to implement "AND"/"OR" in a full-text query:

    • For "AND", use a boolean query with only "MUST" clauses.
    • For "OR", use a boolean query with only "SHOULD" clauses.
    • Nest boolean queries as necessary to combine "AND" and "OR". Think of each boolean query as a pair of parentheses in a traditional boolean expression: it keeps the "AND" operators from interfering with the "OR" operators.
    • Never mix "MUST" and "SHOULD" in the same boolean query unless you know what you are doing.

    In your case, you will need two levels of boolean queries:

    ( // parent boolean query implementing "OR"
      ( // First "SHOULD" clause: a boolean query implementing "AND"
        +validityRule.validFrom:[* TO 0000020190601] // "MUST" clause
        +validityRule.validTo:[0000020190701 TO *] // "MUST" clause
      )
      validityRule:valid // Second "SHOULD" clause
    )
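
To make the difference between the flat and the nested query concrete, here is a minimal sketch in plain Java that models only the matching rule described above: "SHOULD" clauses decide matching only when the boolean query has no "MUST" clause. It is a toy model, not the Lucene or Hibernate Search API; the class BoolQuerySketch, its helper methods, and the sample date values are all hypothetical, and scoring and "MUST NOT" clauses are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Toy model of MUST/SHOULD matching. Hypothetical helper, NOT the Lucene or Hibernate Search API. */
final class BoolQuerySketch {

    /** A query is a predicate over a document, modeled as field name -> indexed string. */
    interface Query extends Predicate<Map<String, String>> {}

    /** Matches when the field holds exactly the given value. */
    static Query term(String field, String value) {
        return doc -> value.equals(doc.get(field));
    }

    /** Inclusive range on the indexed string; a null bound stands for "*". */
    static Query range(String field, String lower, String upper) {
        return doc -> {
            String v = doc.get(field);
            return v != null
                    && (lower == null || v.compareTo(lower) >= 0)
                    && (upper == null || v.compareTo(upper) <= 0);
        };
    }

    /** Boolean query: SHOULD clauses decide matching only when there is no MUST clause. */
    static final class Bool implements Query {
        private final List<Query> must = new ArrayList<>();
        private final List<Query> should = new ArrayList<>();

        Bool must(Query q) { must.add(q); return this; }
        Bool should(Query q) { should.add(q); return this; }

        @Override
        public boolean test(Map<String, String> doc) {
            if (!must.isEmpty()) {
                // All MUST clauses have to match; SHOULD clauses would only affect the score.
                return must.stream().allMatch(q -> q.test(doc));
            }
            // No MUST clause: at least one SHOULD clause has to match.
            return should.stream().anyMatch(q -> q.test(doc));
        }
    }

    static Query validFrom() { return range("validityRule.validFrom", null, "0000020190601"); }
    static Query validTo() { return range("validityRule.validTo", "0000020190701", null); }
    static Query nullMarker() { return term("validityRule", "valid"); }

    /** The original flat query: two MUST clauses and one SHOULD clause. */
    static Query flat() {
        return new Bool().must(validFrom()).must(validTo()).should(nullMarker());
    }

    /** The nested query: an "OR" of (an "AND" of the two ranges) and the null marker. */
    static Query nested() {
        Query bothRanges = new Bool().must(validFrom()).must(validTo());
        return new Bool().should(bothRanges).should(nullMarker());
    }

    public static void main(String[] args) {
        // Made-up sample documents; the date values are illustrative only.
        Map<String, String> withRule = Map.of(
                "validityRule.validFrom", "0000020190101",
                "validityRule.validTo", "0000020191231");
        Map<String, String> withoutRule = Map.of("validityRule", "valid");

        System.out.println("flat,   with rule:    " + flat().test(withRule));
        System.out.println("flat,   without rule: " + flat().test(withoutRule));
        System.out.println("nested, with rule:    " + nested().test(withRule));
        System.out.println("nested, without rule: " + nested().test(withoutRule));
    }
}
```

In this model the flat query rejects the document that only carries the 'valid' marker, while the nested query accepts both documents. With Hibernate Search 5 the same nesting would presumably be expressed through the query DSL, by passing one bool() junction built from must(...) clauses as a should(...) clause of an outer bool() junction.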