python regex text-mining

text mining python keys


I have a multiline, tab-separated file whose second column may or may not contain certain keywords:

Place1______________fish

Place2______________fishing someting

Placexx_____________something missing

Place_somwhere______something else missing

EHDN_______________fishing something

HDGFE______________looking for something

(The lines are ugly, but I couldn't manage to make the data look like a table.)

Each time a line contains 'something missing', I need to add an annotation at the end of that line, such as "ACTION NEEDED THERE".

I've tried something like:

pattern="something missing"
OUT=open('/Users/user/output.tab','w')

for line in file:
  field=line.split('\t')
  if pattern in field[1]:
    action = ';'.join("ACTION NEEDED")
    OUT.write(action.strip().replace('"',' '))

I also tried the re.findall function, without success.

Can you help me, please? Should re.findall work here? I've tried pattern=re.findall("something missing", line), but it's not working. Sorry for asking, but I couldn't find the right answer in previous posts. Many thanks in advance!


Solution

  • Change this,

    if pattern in field[1]:
    

    to

    if any(word in line for word in pattern.split()):
    

    You can then add the annotation with:

    line+" "+your_annotation
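For reference, here is a minimal end-to-end sketch of the annotation step, under a few assumptions that are not in the original post: the function names `annotate` and `annotate_file` are hypothetical, the annotation is appended as an extra tab-separated column, and the whole phrase must appear in the second column. Keep in mind that the word-by-word `any(...)` check above also fires on lines that contain only "something" or only "missing" (for example "fishing something"), so this sketch uses a plain substring test on the full phrase instead. For a fixed phrase like this, the `in` operator is enough and `re.findall` is not required. The trailing newline is stripped before appending, otherwise the annotation would land on the next line.

```python
PATTERN = "something missing"
ANNOTATION = "ACTION NEEDED THERE"  # annotation text taken from the question


def annotate(lines, pattern=PATTERN, annotation=ANNOTATION):
    """Yield every line; append the annotation when column 2 contains the pattern."""
    for line in lines:
        text = line.rstrip("\n")          # drop the newline so the annotation stays on this line
        fields = text.split("\t")
        # Guard against lines that have no second column
        if len(fields) > 1 and pattern in fields[1]:
            text = text + "\t" + annotation  # assumption: annotation goes in a new tab-separated column
        yield text + "\n"


def annotate_file(src_path, dst_path):
    """Hypothetical helper: read src_path and write the annotated copy to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(annotate(src))


# Small in-memory check using rows modeled on the question's sample data
sample = [
    "Place1\tfish\n",
    "Placexx\tsomething missing\n",
    "EHDN\tfishing something\n",
]
result = list(annotate(sample))
```

Unlike the attempt in the question, this writes every line back out (annotated or not), so the output file keeps all the original rows.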