I have this Python function that works as expected. Is it possible to package the logic as an NLP stemmer? If so, what changes need to be made?
import itertools, re

def dropdup(mytuple):
    newtup = list()
    for i in mytuple:
        i = i[:-3] if i.endswith('bai') else i
        for r in (("tha", "ta"), ("i", "e")):
            i = i.replace(*r)
        i = re.sub(r'(\w)\1+', r'\1', i)
        newtup.append(''.join(i for i, _ in itertools.groupby(i)))
    return tuple(newtup)

dropdup(('savithabai', 'samiiir', 'aaaabaa'))
('saveta', 'samer', 'aba')
I would like users to be able to import something like this:
from nltk.stemmer import indianNameStemmer
There are a few more rules to be added to the logic. I just want to know if this is a valid (pythonic) idea.
First, see https://www.python-course.eu/python3_inheritance.php for background on inheritance.
Then create a file mystemmer.py that subclasses NLTK's StemmerI:
import itertools, re
from nltk.stem import StemmerI

class MyStemmer(StemmerI):
    def stem(self, token):
        # Strip a trailing 'bai', then apply the replacement rules.
        token = token[:-3] if token.endswith('bai') else token
        for r in (("tha", "ta"), ("i", "e")):
            token = token.replace(*r)
        # Collapse runs of repeated characters.
        token = re.sub(r'(\w)\1+', r'\1', token)
        return ''.join(i for i, _ in itertools.groupby(token))

Usage:
>>> from mystemmer import MyStemmer
>>> s = MyStemmer()
>>> s.stem('savithabai')
'saveta'
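Since the question mentions that more rules will be added, one option is to keep the rules as data on the class so that new ones can be added without touching stem(). The following is only a sketch under assumptions: the class name IndianNameStemmer and the attributes suffixes and replacements are hypothetical names chosen for illustration, not part of NLTK. It also drops the re.sub call, because itertools.groupby alone already collapses every run of repeated characters. The fallback to object only keeps the sketch runnable where NLTK is not installed.

```python
import itertools

try:
    from nltk.stem import StemmerI
except ImportError:
    # Assumption: fall back to a plain base class if NLTK is missing,
    # so the sketch still runs. With NLTK installed, StemmerI is used.
    StemmerI = object


class IndianNameStemmer(StemmerI):
    # Hypothetical rule tables; extend these tuples to add more rules.
    suffixes = ("bai",)
    replacements = (("tha", "ta"), ("i", "e"))

    def stem(self, token):
        # Strip the first matching suffix, if any.
        for suffix in self.suffixes:
            if token.endswith(suffix):
                token = token[:-len(suffix)]
                break
        # Apply the substring replacements in order.
        for old, new in self.replacements:
            token = token.replace(old, new)
        # Collapse runs of repeated characters, e.g. "eee" -> "e".
        return "".join(ch for ch, _ in itertools.groupby(token))


stemmer = IndianNameStemmer()
names = ("savithabai", "samiiir", "aaaabaa")
print(tuple(stemmer.stem(name) for name in names))
# ('saveta', 'samer', 'aba')
```

Stemming a whole tuple, as dropdup did, then becomes a one-line generator expression over stem(), and a subclass can override suffixes or replacements to try a different rule set.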