python, list, nltk, punctuation

Remove selected punctuation from list of sentences


I have a list of sentences like this:
[' no , 2nd main 4th a cross, uas layout, near ganesha temple/ bsnl exchange, sanjaynagar, bangalore',
' grihalakshmi apartments flat , southend road basavangudi bangalore -560004. opp adiyar ananda bhavan near south end c',
' srinivas pg acomudation ;opp to cosmos mall brooke field',
' royal palms 2nd cross,l b sastry nagar bangalore',
' bmp ho name grija \krishnappa garden bagamane .technologi park cv ramanagar']

I need to remove all punctuation except for the comma (,) and the slash (/). I used string.punctuation, which removes every punctuation character:

import string

def punc(x):
    # Keep only the characters that are not punctuation.
    return ''.join(ch for ch in x if ch not in string.punctuation)

data = data.apply(punc)

This removed all punctuation, but I only want to remove selected characters. Please help.

I used .apply() when the data was part of a DataFrame. Now I've converted it to a list, so please recommend a technique for removing all but a few chosen punctuation characters from the strings in a list object.


Solution

  • Try this

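    import string

    def punc(x):
        # Punctuation to strip: everything except ',' and '/'.
        remove = ''.join(c for c in string.punctuation if c not in ',/')
        # ''.join is needed on Python 3, where filter() returns an iterator.
        return ''.join(filter(lambda y: y not in remove, x))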
    

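    Besides, the Python 2 builtin apply() is deprecated (and was removed in Python 3); the pandas .apply() method is not. For a plain list, call the function in a list comprehension instead: [punc(s) for s in data].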

    To remove the punctuation only when certain prepositions appear in the string:

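    import string

    def punc(x):
        # Punctuation to strip: everything except ',' and '/'.
        remove = ''.join(c for c in string.punctuation if c not in ',/')
        prepositions = ['a', 'in']  # define your own list
        # Filter only when one of the prepositions appears as a whole word.
        if any(p in x.split() for p in prepositions):
            return ''.join(filter(lambda y: y not in remove, x))
        return x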
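
Since the data is now a plain list, the same idea can be sketched with str.translate and a list comprehension. This is only an illustrative alternative, not the answerer's method: the names KEEP, REMOVE, TABLE and strip_punctuation are made up for this example, and the two sample sentences are taken from the question. It assumes Python 3.

```python
import string

# Characters to keep (assumption taken from the question: ',' and '/').
KEEP = ",/"
# Everything in string.punctuation except the characters we want to keep.
REMOVE = "".join(c for c in string.punctuation if c not in KEEP)

# Translation table that deletes every character listed in REMOVE.
TABLE = str.maketrans("", "", REMOVE)

def strip_punctuation(sentence):
    """Return the sentence without punctuation, keeping ',' and '/'."""
    return sentence.translate(TABLE)

sentences = [
    " srinivas pg acomudation ;opp to cosmos mall brooke field",
    " royal palms 2nd cross,l b sastry nagar bangalore",
]

# Apply the function to every string in the list.
cleaned = [strip_punctuation(s) for s in sentences]
print(cleaned)
```

Building the translation table once avoids rebuilding the set of unwanted characters for every character in every sentence, which may matter for a long list.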