java lucene luke

Luke + lucene 5.4.1


I am using a custom analyzer to build an index with Lucene 5.4.1, and I am trying to use Luke to search the data in that index. I want to use my custom analyzer in Luke, but it does not appear in the Analyzers tab.

I am using the following command to add my analyzer to Luke:

    java -cp "pivot-luke-with-deps.jar;CatalogSearchAnalyzer.jar" org.getopt.luke.Luke
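The `;` in the `-cp` value of that command is the Windows classpath separator; on Linux or macOS the separator is `:`. As a small sketch of how the same classpath string could be built portably, `File.pathSeparator` can be used. The class name `ClasspathBuilder` and the helper `join` are hypothetical names chosen for illustration; the jar names and main class are the ones from the command in the question.

```java
import java.io.File;

class ClasspathBuilder {
    // Joins jar paths with the platform's classpath separator
    // (";" on Windows, ":" on Linux and macOS).
    static String join(String... jars) {
        return String.join(File.pathSeparator, jars);
    }

    public static void main(String[] args) {
        String cp = join("pivot-luke-with-deps.jar", "CatalogSearchAnalyzer.jar");
        // Prints the launch command using the separator of the current platform.
        System.out.println("java -cp \"" + cp + "\" org.getopt.luke.Luke");
    }
}
```

If the wrong separator is used, the JVM treats the whole string as a single classpath entry, so neither jar is found.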

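My analyzer code:

```java
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.StopFilter;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.util.CharArraySet;
import org.apache.lucene.util.Version;

// PorterStemFilter (with setRetMultiple) and CmgtTokenizer appear to be
// project-specific classes; their imports are not shown in the question.
public class CatalogSearchAnalyzer extends Analyzer {
    public static final String[] STOP_WORDS = { "a", "and", "are", "as", "at",
            "be", "but", "by", "for", "if", "in", "into", "is", "it", "no",
            "not", "of", "on", "or", "such", "t", "that", "the", "their",
            "then", "there", "these", "they", "this", "to", "was", "will",
            "with" };

    private Version matchVersion;
    private String termValue;
    private boolean retMultiple;
    private CharArraySet stopTable;
    private int maxTokenLength;

    public CatalogSearchAnalyzer(Version matchVersion) {
        this.stopTable = StopFilter.makeStopSet(STOP_WORDS);
        this.maxTokenLength = 255;
        this.matchVersion = matchVersion;
    }

    public CatalogSearchAnalyzer() {
        this(STOP_WORDS);
    }

    public CatalogSearchAnalyzer(String[] stopWords) {
        // Build the stop set from the supplied words.
        this.stopTable = StopFilter.makeStopSet(stopWords);
        this.maxTokenLength = 255;
    }

    public void setTermValue(String termValue) {
        this.termValue = termValue;
    }

    public void setRetMultiple(boolean retMultiple) {
        this.retMultiple = retMultiple;
    }

    // Wraps the stream in the stemming filter and passes on the retMultiple flag.
    private TokenStream getStemmingFilter(TokenStream result) {
        PorterStemFilter temp = new PorterStemFilter(result);
        temp.setRetMultiple(this.retMultiple);
        return temp;
    }

    // Chain: standard tokenizer -> standard filter -> lower case -> stop words -> stemming.
    @Override
    protected Analyzer.TokenStreamComponents createComponents(String fieldName) {
        StandardTokenizer st = new StandardTokenizer();
        st.setMaxTokenLength(this.maxTokenLength);
        Tokenizer tk = st;
        TokenStream ts = new StandardFilter(tk);
        ts = new LowerCaseFilter(ts);
        ts = new StopFilter(ts, this.stopTable);
        ts = getStemmingFilter(ts);
        return new Analyzer.TokenStreamComponents(tk, ts) {
            @Override
            protected void setReader(Reader reader) {
                int m = CatalogSearchAnalyzer.this.maxTokenLength;
                // The source above is a StandardTokenizer, so this branch
                // only applies if a CmgtTokenizer is used as the source.
                if (this.source instanceof CmgtTokenizer) {
                    ((CmgtTokenizer) this.source).setMaxTokenLength(m);
                }
                super.setReader(reader);
            }
        };
    }
}
```

I am not getting any exception when adding my jar to Luke.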
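Since no exception is reported, one thing that can be checked independently of Luke is whether the class can be loaded and instantiated by name at all. The sketch below assumes, without confirmation from Luke's source, that a tool listing analyzers needs a public, non-abstract class with a public no-argument constructor that extends the expected base class. `PluginClassCheck` and `isInstantiable` are hypothetical names; the check uses only standard Java reflection.

```java
import java.lang.reflect.Modifier;

class PluginClassCheck {
    // Returns true if 'candidate' is a public, concrete subtype of 'base'
    // with a public no-argument constructor. This mirrors an ASSUMED
    // requirement of plugin-style loaders, not Luke's documented behavior.
    static boolean isInstantiable(Class<?> candidate, Class<?> base) {
        if (!base.isAssignableFrom(candidate)) {
            return false; // not a subtype of the expected base class
        }
        int mods = candidate.getModifiers();
        if (!Modifier.isPublic(mods) || Modifier.isAbstract(mods) || candidate.isInterface()) {
            return false; // cannot be constructed from outside its package
        }
        try {
            candidate.getConstructor(); // finds public constructors only
            return true;
        } catch (NoSuchMethodException e) {
            return false; // no public no-argument constructor
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        if (args.length != 2) {
            System.err.println("usage: PluginClassCheck <candidate class> <base class>");
            return;
        }
        // Class.forName fails with ClassNotFoundException if the jar is not on the classpath.
        Class<?> candidate = Class.forName(args[0]);
        Class<?> base = Class.forName(args[1]);
        System.out.println(args[0] + " instantiable as " + args[1] + ": "
                + isInstantiable(candidate, base));
    }
}
```

With the analyzer jar and the Lucene jars on the classpath, this could be run with `CatalogSearchAnalyzer` (fully qualified if it lives in a package) and `org.apache.lucene.analysis.Analyzer` as the two arguments. A `true` result only rules out loading problems; it does not show that a particular Luke build will list the class.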

Thanks in advance for looking into this.


Solution

  • As noted in the comments under the question, the solution is to use the original Thinlet-based version of Luke instead of the Pivot-based one. The Pivot-based Luke is a work in progress and does not yet support all features (though more testing is encouraged!).

    Thinlet luke on master (currently): https://github.com/DmitryKey/luke

    Pivot luke on branch: https://github.com/DmitryKey/luke/tree/pivot-luke