java lucene indexing search-engine

How to index a String in Lucene?


I'm using Lucene to index strings that I read from a document. I'm not using a Reader, since I need to index the strings into different fields.

document.add(new Field("FIELD1","string1", Field.Store.YES, Field.Index.UN_TOKENIZED));
document.add(new Field("FIELD2","string2", Field.Store.YES, Field.Index.UN_TOKENIZED));

Building the index works, but searching with

QueryParser queryParser = new QueryParser("FIELD1", new StandardAnalyzer());
Query query = queryParser.parse(searchString);
Hits hits = indexSearcher.search(query);
System.out.println("Number of hits: " + hits.length());

doesn't return any results.

But when I index a sentence like,

document.add(new Field("FIELD1","This is sentence to be indexed", Field.Store.YES, Field.Index.TOKENIZED));

searching works fine.

Thanks.


Solution

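  • Index these fields with Field.Index.TOKENIZED as well. An untokenized field is still indexed, but its whole value is stored as a single term, exactly as written, without passing through an analyzer. QueryParser, on the other hand, runs the search string through the StandardAnalyzer, which lowercases it and splits it into words, so the query terms only match when the indexed value is already a single lowercase word. With Field.Index.TOKENIZED the value is analyzed at indexing time too (provided the IndexWriter uses the same analyzer), so both sides produce comparable terms. A single word such as "string1" is then simply indexed as the term "string1".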

    Use this:

    document.add(new Field("FIELD1","string1", Field.Store.YES, Field.Index.TOKENIZED));
    document.add(new Field("FIELD2","string2", Field.Store.YES, Field.Index.TOKENIZED));
    

    To index a string containing multiple words, e.g. "two words", as one searchable term without splitting it into two words, either use the KeywordAnalyzer during indexing, which treats the whole string as a single token, or use a StringField in newer versions of Lucene.
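
The difference between tokenized and untokenized indexing can be illustrated with a small toy model. This is only a sketch and deliberately does not use the Lucene API: the class `TokenizationSketch`, its methods, and the simplified `analyze` function (lowercase, split on non-alphanumeric characters) are all hypothetical stand-ins, and a real analyzer such as StandardAnalyzer does more, for example stop-word removal.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy inverted index for a single field. NOT Lucene; all names are hypothetical.
class TokenizationSketch {
    // term -> ids of the documents that contain this term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Simplified stand-in for an analyzer: lowercase, then split on
    // anything that is not a letter or a digit.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // "Tokenized" indexing: every analyzed word becomes its own term.
    void addTokenized(int docId, String value) {
        for (String term : analyze(value)) {
            postings.computeIfAbsent(term, k -> new HashSet<>()).add(docId);
        }
    }

    // "Untokenized" indexing: the whole value becomes one term, verbatim.
    void addUntokenized(int docId, String value) {
        postings.computeIfAbsent(value, k -> new HashSet<>()).add(docId);
    }

    // Exact term lookup; the search string is not analyzed.
    Set<Integer> lookup(String term) {
        return postings.getOrDefault(term, Collections.<Integer>emptySet());
    }

    // Analyzed search: the search string is analyzed first and a document
    // matches if it contains any of the resulting terms.
    Set<Integer> searchAnalyzed(String query) {
        Set<Integer> hits = new HashSet<>();
        for (String term : analyze(query)) {
            hits.addAll(lookup(term));
        }
        return hits;
    }

    public static void main(String[] args) {
        TokenizationSketch untokenized = new TokenizationSketch();
        untokenized.addUntokenized(1, "Two Words");
        // The analyzed query yields "two" and "words"; neither equals the
        // single indexed term "Two Words", so nothing is found.
        System.out.println(untokenized.searchAnalyzed("Two Words")); // prints []
        // An exact, unanalyzed lookup of the whole value does match.
        System.out.println(untokenized.lookup("Two Words"));         // prints [1]

        TokenizationSketch tokenized = new TokenizationSketch();
        tokenized.addTokenized(1, "Two Words");
        // Index and query were analyzed the same way, so the terms match.
        System.out.println(tokenized.searchAnalyzed("Two Words"));   // prints [1]
    }
}
```

In this model the untokenized value is findable, but only by a lookup that uses the exact original string, which loosely mirrors why an analyzed query tends to miss untokenized fields and why keeping a multi-word value as one term calls for a keyword-style analyzer on the query side as well.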