I have a live Solr cluster where stemming was not enabled and my schema.xml looks like this:
..
<field name="Searchable_Text" type="text_general" indexed="true" stored="true" multiValued="false"/>
..
<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
..
<copyField source="Searchable_Text" dest="text" maxChars="3000"/>
..
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
..
and these are the steps I took to enable stemming on my live cluster
Changed the schema.xml to including stemming in the index:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
and I disabled the opensearcher in solrconfig :
<autoCommit>
<maxTime>${solr.autoCommit.maxTime:60000}</maxTime>
<openSearcher>false</openSearcher> <!-- was set to true earlier-->
</autoCommit>
I then reindexed my entire data. My assumption is that the data is committed but since the opensearcher is set to false, the newly indexed data is not visible.
After this, I changed the schema.xml to include stemming in the query and changed solrconfig.xml to set opensearcher as true :
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>
and
<autoCommit>
<maxTime>${solr.autoCommit.maxTime:60000}</maxTime>
<openSearcher>true</openSearcher> <!-- was set to true earlier-->
</autoCommit>
I then reloaded the core. But I still dont see my queries stemmed. A debugQuery check doesnt seem to show stemming in the query. Its quite weird. Is there anything wrong with my approach ?
I am using Solr 4.7
Thanks
Well, there was something stupid which I landed up doing because of which things didn't work. The above steps surely worked for me, except for that when I reloaded the core, I did it using the LB VIP and not each individual machine (!) . Doing that solved my problem.
Anyways, thanks everybody !