lucene, lexical-analysis, luke

Lucene StandardAnalyzer not converting to lowercase when writing index


I am using Lucene 8.3 and am seeing unexpected behavior from StandardAnalyzer. To isolate the problem, I reproduced the same behavior in Luke. Here is the description for Luke:

I am creating a new document, using StandardAnalyzer. One single field: name=myField; type=StringField; options=store-value; value='Foo'

Then I switch to search: parsing the query 'myField:Foo' shows that the term is converted to lowercase, which is the expected behavior of StandardAnalyzer. Searching, however, yields zero results. Switching to WhitespaceAnalyzer parses the same term with its case preserved, and searching then returns the one document I just entered.

To me it looks as if StandardAnalyzer does not convert the text to lowercase during document creation / index writing. I see the same behavior in my Java code.

What am I missing? Are there any other settings that I need to observe?


Solution

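  • StringField is not analyzed: its value is indexed verbatim as a single token, so 'Foo' is stored as 'Foo' while the analyzed query looks for 'foo'. Use TextField if the value should pass through the analyzer.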

    Common field types are documented here: http://lucene.apache.org/core/8_3_0/core/org/apache/lucene/document/Field.html
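
    The difference can be reproduced in a few lines of Java. The following is a minimal sketch, assuming Lucene 8.x with the lucene-core and lucene-queryparser modules on the classpath; the class name FieldTypeDemo and the helper countHits are made up for illustration and are not part of Lucene.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class FieldTypeDemo {

    // Hypothetical helper: indexes a single document containing `field`
    // with StandardAnalyzer, then counts the documents matching `query`.
    static int countHits(Field field, Query query) throws Exception {
        try (Directory dir = new ByteBuffersDirectory()) {
            IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
            try (IndexWriter writer = new IndexWriter(dir, config)) {
                Document doc = new Document();
                doc.add(field);
                writer.addDocument(doc);
            }
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                return new IndexSearcher(reader).count(query);
            }
        }
    }

    // Parses the query string the way the question does: through StandardAnalyzer,
    // which lowercases 'Foo' to 'foo'.
    static Query parsed(String queryString) throws Exception {
        return new QueryParser("myField", new StandardAnalyzer()).parse(queryString);
    }

    public static void main(String[] args) throws Exception {
        // StringField: value indexed verbatim as 'Foo', analyzed query looks for 'foo'.
        System.out.println(countHits(
                new StringField("myField", "Foo", Field.Store.YES), parsed("myField:Foo")));

        // TextField: value is analyzed at index time, so 'foo' is in the index.
        System.out.println(countHits(
                new TextField("myField", "Foo", Field.Store.YES), parsed("myField:Foo")));

        // StringField with an exact, unanalyzed TermQuery on the original value.
        System.out.println(countHits(
                new StringField("myField", "Foo", Field.Store.YES),
                new TermQuery(new Term("myField", "Foo"))));
    }
}
```

    The first call should find nothing, matching the behavior described in the question, while the second should find the document. The third call shows the alternative if the field is meant to hold an exact identifier: keep StringField and query it with a TermQuery on the unmodified value instead of going through the analyzing query parser.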