I have two versions of Solr running on my machine, say SolrVer1 and SolrVer2.
SolrVer1 applies the following stemming filters to the field type text_en_splitting:
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" ignoreCase="true"/>
<filter class="solr.PorterStemFilterFactory" ignoreCase="true"/>
SolrVer2 applies the following stemming filter to the field type text_en_splitting:
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
Both behave almost the same for regular searches, but wildcard searches on SolrVer1 miss grammatical variants of the term. For example, searching for ray* returns far fewer results on SolrVer1 than on SolrVer2. When I looked at the results, I found that SolrVer1 does not return documents that contain only ray or rays.
I don't know when I should use SnowballPorterFilterFactory and when I should use PorterStemFilterFactory. What are the pros and cons of each?
Does anybody have an idea about this behavior?
Thanks
You need to know what each stemmer outputs for ray and rays.
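Try stemming them with the online Porter stemmer tool: http://qaa.ath.cx/porter_js_demo.html. It outputs rai for both! Wildcard terms are not passed through the stemmer at query time, so ray* is compared as a literal prefix against the indexed term rai. That's the reason ray* does not match documents containing only ray or rays with the Porter stemmer.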
And here is a tool for the Snowball stemmer: http://snowball.tartarus.org/demo.php.
It outputs ray for both ray and rays, which is why you get the results.
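To make the mismatch concrete, here is a minimal Python sketch that simulates a wildcard prefix lookup against already-stemmed index terms. It is not Solr code: the index contents are hypothetical, the stems for ray and rays are taken from the tool outputs quoted above, and the extra term rayon is an assumption (a longer word that both stemmers are assumed to leave unchanged), included only to show why SolrVer1 can still return some results.

```python
# Hypothetical index contents after stemming. The stems for "ray"/"rays" come
# from the demo outputs quoted above; "rayon" is an assumed unchanged term.
porter_index = {"rai", "rayon"}    # "ray" and "rays" were both indexed as "rai"
snowball_index = {"ray", "rayon"}  # "ray" and "rays" were both indexed as "ray"


def wildcard_prefix_match(prefix, indexed_terms):
    """Return the indexed terms that start with the raw, unstemmed prefix."""
    return sorted(term for term in indexed_terms if term.startswith(prefix))


# A query such as ray* is modelled as a literal prefix "ray".
porter_hits = wildcard_prefix_match("ray", porter_index)
snowball_hits = wildcard_prefix_match("ray", snowball_index)

print("Porter index hits:  ", porter_hits)
print("Snowball index hits:", snowball_hits)
```

In this sketch the Porter-style index only yields rayon, while the Snowball-style index yields both ray and rayon, which mirrors the smaller result set seen on SolrVer1.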
You may want to read this to compare the two stemmers: http://snowball.tartarus.org/texts/introduction.html
It appears that Snowball was designed to address such shortcomings of Porter.
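As far as I can tell, the difference for ray comes down to the rule that turns a trailing y into i. The following Python sketch reduces each stemmer to just two steps (dropping a plural s and the y rule) to show where they diverge. It is a deliberately simplified illustration, not a full implementation of either algorithm, and all function names are made up for this example.

```python
VOWELS = set("aeiou")


def strip_plural_s(word):
    # Simplified plural handling: drop one trailing "s" (but not "ss").
    # The real stemmers cover more suffixes ("sses", "ies", ...).
    if len(word) > 2 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def porter_y_rule(word):
    # Original Porter, step 1c: y -> i when the part before the y contains a vowel.
    if word.endswith("y") and any(ch in VOWELS for ch in word[:-1]):
        return word[:-1] + "i"
    return word


def snowball_y_rule(word):
    # Snowball English (Porter2), step 1c: y -> i only when the letter before it
    # is a non-vowel and that letter is not the first letter of the word.
    if word.endswith("y") and len(word) > 2 and word[-2] not in VOWELS:
        return word[:-1] + "i"
    return word


def porter_sketch(word):
    return porter_y_rule(strip_plural_s(word.lower()))


def snowball_sketch(word):
    return snowball_y_rule(strip_plural_s(word.lower()))


for w in ("ray", "rays"):
    print(w, "->", porter_sketch(w), "(Porter sketch),", snowball_sketch(w), "(Snowball sketch)")
```

For both inputs the Porter sketch yields rai and the Snowball sketch yields ray, because the a before the y is a vowel and so the Snowball rule leaves the word alone.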