What is the point of the Porter Stemmer algorithm having a rule that converts SS to SS?
Imagine the rule SS->SS was not in the algorithm. Then no rule would explicitly recognize words like caress, and it would seem that the algorithm can't do anything to reduce them to a stem. However, with the rule SS->SS the stemmer says: "I recognize the word caress and I reduce it to caress. I'm done". The alternative would be: "I can't do anything". Of course it is fictitious work, but what matters is that it increases the precision of the stemmer. You can see that when the algorithm is tested. If this rule were not in the stemmer, the results would be different (worse). Look at the word list [ridiculousness, caress]:
Case 1. Rule SS->SS in the algorithm.
Stemming:
caress (Step 1a) -> caress OK
ridiculousness (Step 2) -> ridiculous (Step 4) -> ridicul OK
Success rate: 100%
Case 2. Rule SS->SS not in the algorithm.
Stemming:
caress -> fail
ridiculousness (Step 2) -> ridiculous (Step 4) -> ridicul OK
Success rate: 50%
From a practical point of view the rule does not change the word it matches. It's just a formalism that lets the stemmer handle such words explicitly.
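
To make the two cases concrete, here is a minimal Python sketch of Step 1a only. It assumes the rule selection from Porter's original description, where among the rules of a step the one with the longest matching suffix is applied. The names step_1a, STEP_1A_RULES and RULES_WITHOUT_SS are made up for this illustration and do not come from any library.

```python
# Step 1a rules as (suffix, replacement) pairs: SSES -> SS, IES -> I, SS -> SS, S -> (empty).
STEP_1A_RULES = [("sses", "ss"), ("ies", "i"), ("ss", "ss"), ("s", "")]


def step_1a(word, rules=STEP_1A_RULES):
    """Apply the rule with the longest matching suffix; leave the word alone if none match."""
    matching = [(suffix, repl) for suffix, repl in rules if word.endswith(suffix)]
    if not matching:
        return word
    # Assumption: longest-match selection, as in Porter's description of a rule group.
    suffix, repl = max(matching, key=lambda rule: len(rule[0]))
    return word[: len(word) - len(suffix)] + repl


# Hypothetical variant for comparison: the same table with SS -> SS removed.
RULES_WITHOUT_SS = [rule for rule in STEP_1A_RULES if rule[0] != "ss"]

for word in ["caress", "caresses", "ponies", "cats"]:
    print(word, "->", step_1a(word), "| without SS->SS:", step_1a(word, RULES_WITHOUT_SS))
```

In this sketch caress stays caress with the full table, because SS->SS is the longest match and so takes precedence over the plain S rule. With SS->SS removed, the S rule matches instead and caress becomes cares. Under that reading, "fail" in Case 2 means a wrong stem rather than no result at all.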