How can I use the TreeTagger in a Python script?
I have a sentence, and the TreeTagger should analyze it. On the command line, I can do the following:
echo 'This is a test!' | cmd/tree-tagger-english-utf8
But how can I do this from within a Python script?
The output of the command above is the following:
echo 'This is a test!' | cmd/tree-tagger-english
reading parameters ...
tagging ...
finished.
This DT this
is VBZ be
a DT a
test NN test
! SENT !
In my script, I need the tags, i.e. "DT", "VBZ", "DT", "NN", "SENT", which I'd like to save in a list. I need these tags later to insert them into a string.
Thanks for any help! :)
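Look at the subprocess module. A simple example follows; it reads the sentence from standard input, so pipe the text into the script the same way you pipe it into the tagger on the command line.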
$ cat test.py
#!/usr/bin/env python3
import subprocess
import sys

# Feed this script's stdin to the tagger and capture its stdout as text.
process = subprocess.Popen(
    ["cmd/tree-tagger-english-utf8"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    encoding="utf-8",
)
output, _ = process.communicate(sys.stdin.read())

list_of_lists = []
for line in output.splitlines():
    # Assumes TreeTagger's usual tab-separated token/tag/lemma rows.
    fields = line.split("\t")
    # Keeping only 3-field rows also drops any status lines
    # ("reading parameters ...", "tagging ...", "finished.").
    if len(fields) == 3:
        list_of_lists.append(fields)

print(list_of_lists)
$
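If you only need the tags in a list, as the question asks, the same idea can be wrapped in a function. The following is a minimal sketch, not a definitive implementation: `extract_tags`, `tag_sentence` and `TAGGER_CMD` are names made up for this example, the command path is the one from the question, and the code assumes the tagger writes tab-separated token/tag/lemma rows.

```python
import subprocess

# Path taken from the question; adjust it to your TreeTagger installation.
TAGGER_CMD = "cmd/tree-tagger-english-utf8"


def extract_tags(tagger_output):
    """Return the tag column from tab-separated token/tag/lemma rows."""
    tags = []
    for line in tagger_output.splitlines():
        fields = line.split("\t")
        if len(fields) == 3:  # skips blank lines and status messages
            tags.append(fields[1])
    return tags


def tag_sentence(sentence, cmd=TAGGER_CMD):
    """Pipe one sentence through the tagger script and return its tags."""
    result = subprocess.run(
        [cmd],
        input=sentence + "\n",
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # keep status messages out of the terminal
        encoding="utf-8",
        check=True,  # raise CalledProcessError if the tagger fails
    )
    return extract_tags(result.stdout)


# Example call (requires TreeTagger to be installed at TAGGER_CMD):
# tags = tag_sentence("This is a test!")
```

For the sentence in the question, `tags` should then hold the five tags shown in the tagger output there. Parsing is kept separate from the subprocess call so that `extract_tags` can be checked on a saved output string without running the tagger.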