I am trying to filter the words in a text file. If there are any 'comparative' and 'superlative' words in the file, I want to convert them to 'positive'.
e.g. - 'greatest' -> 'great' and so on.
I am using the 'pattern' module for this. The example says:
from pattern.en import comparative, superlative
print comparative('bad')
This gives 'worse', which works fine.
But if I do:
from pattern.en import comparative, superlative, positive
print positive('worse')
it gives 'False'.
Am I doing it wrong? Is there any way to detect 'comparative' and 'superlative' words and print their positive forms?
This is a misunderstanding: the positive() function doesn't do what you think.
As far as I can see, the pattern.en module only provides functions for generating comparatives and superlatives from the positive form of an adjective, but not for the inverse (analysing the forms as comparative/superlative of a positive form).
There is a lemma() function, which you could expect to do this, but unfortunately it only works for verbs.
The positive() function you found belongs to sentiment detection; it tries to tell if a given sentence has a positive polarity.
So, what do you do now?
I see two possibilities: you either switch to a different library that supports lemmatisation of adjectives (e.g. spaCy), or you try to build a simple adjective lemmatiser based on the code from the pattern.en module.
If you go for the second option, have a look at the last 80 lines of code in the inflect module. I suggest you first try to catch the irregular cases (using an inversion of the table given there), then strip off the -er/-est suffix. There are probably a number of special cases (like i → y in heavier → heavy).
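To give you a starting point, here is a rough sketch of that second approach in plain Python. It is not code from pattern.en: the IRREGULAR table is a small hand-written stand-in for the inverted table, the function names are made up, and the lexicon argument (a set of known positive forms) is an assumption I added so that words like 'water' are not mangled into 'wat'. Expect it to miss plenty of cases.

```python
# Hypothetical, hand-written table of irregular forms for this sketch.
# An inversion of the table in pattern's inflect module could replace it.
IRREGULAR = {
    "better": "good", "best": "good",
    "worse": "bad", "worst": "bad",
    "farther": "far", "further": "far",
    "farthest": "far", "furthest": "far",
}


def candidate_stems(word):
    """Yield possible positive forms for a word ending in -er or -est."""
    for suffix in ("est", "er"):
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            stem = word[: -len(suffix)]
            yield stem                      # greatest -> great
            yield stem + "e"                # larger -> large
            if len(stem) > 2 and stem[-1] == stem[-2]:
                yield stem[:-1]             # bigger -> big
            if stem.endswith("i"):
                yield stem[:-1] + "y"       # heavier -> heavy


def adjective_lemma(word, lexicon):
    """Return the positive form of `word`, or `word` itself if none is found.

    `lexicon` is an assumed set of known positive adjectives; a candidate is
    accepted only if it appears there.
    """
    word = word.lower()
    if word in IRREGULAR:
        return IRREGULAR[word]
    for candidate in candidate_stems(word):
        if candidate in lexicon:
            return candidate
    return word


# Tiny example lexicon, purely for illustration.
lexicon = {"great", "heavy", "big", "large", "tall"}
words = ["greatest", "heavier", "bigger", "larger", "worse", "water"]
print([adjective_lemma(w, lexicon) for w in words])
# ['great', 'heavy', 'big', 'large', 'bad', 'water']
```

Checking candidates against a word list is a design choice on my part: without it, every noun ending in -er would get its suffix stripped. Any list of adjectives you trust would do for the lexicon.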
Try something yourself, and if you run into problems come back here with a new question!