Tags: subprocess, stanford-nlp

Run the Stanford Parser interactively (using stdin and stdout) or run it as a server


Restarting the parser every time new input arrives is inefficient, so I'd like to run it interactively: read the input from stdin and print the result to stdout. However, the instructions given on the official website under "Can I have the parser run as a filter?" do not seem to be compatible with other options (for example, -port).

I know that CoreNLP can be run as a server, but it cannot receive POS-tagged text as input, so I won't use it.

Here is what I'm trying:

import subprocess
import threading

class myThread(threading.Thread):
    def __init__(self, inQueue, outQueue):
        threading.Thread.__init__(self)

        # Start the parser once and keep it alive between sentences.
        self.cmd = ['java.exe',
                    '-mx4g',
                    '-cp', '*',
                    'edu.stanford.nlp.parser.lexparser.LexicalizedParser',
                    '-model', 'edu/stanford/nlp/models/lexparser/chinesePCFG.ser.gz',
                    '-sentences', 'newline',
                    '-outputFormat', 'conll2007',
                    '-tokenized',
                    '-tagSeparator', '/',
                    '-tokenizerFactory', 'edu.stanford.nlp.process.WhitespaceTokenizer',
                    '-tokenizerMethod', 'newCoreLabelTokenizerFactory',
                    '-encoding', 'utf8']
        # Use self.cmd here; a bare `cmd` is undefined and raises NameError.
        self.subp = subprocess.Popen(self.cmd, stdin=subprocess.PIPE,
                                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        self.inQueue = inQueue
        self.outQueue = outQueue

    def run(self):
        while True:
            rid, sentence = self.inQueue.get()
            print(u"Receive sentence %s" % sentence)
            # One sentence per line, so strip embedded newlines before sending.
            sentence = sentence.replace("\n", "")
            self.subp.stdin.write((sentence + u'\n').encode('utf8'))
            self.subp.stdin.flush()
            print("start readline")
            # NOTE: readline() returns a single line of the parser's output.
            result = self.subp.stdout.readline()
            print("end readline")
            print(result)
            self.outQueue.put((rid, result))

Solution

  • I think you're confusing two things. Both CoreNLP and the Stanford Parser have an option to run as a command-line filter, reading from stdin and writing to stdout. However, only CoreNLP separately provides a webservice implementation.

    Options like -port only make sense for the latter.

    So I agree that you have a valid use case (wanting to input pre-tagged text), but at present there isn't webservice support for it. The easiest path forward would be to write a simple webservice implementation for the parser. We might get to it at some point, but there are a bunch of other current priorities. Anyone else is welcome to write one. :)
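
As a rough illustration of the "simple webservice" idea, here is a minimal Python sketch. It is not an official Stanford implementation. It keeps one filter process alive and puts a small HTTP front end on top of it. The names PersistentFilter, make_handler, serve, and STAND_IN_CMD are hypothetical. The sketch assumes the process writes a blank line after each sentence's output, which is how CoNLL-style formats normally separate sentences; check this against your parser's actual output. STAND_IN_CMD is a toy Python child used only so the sketch runs without Java. The java command from the question would take its place, provided it is actually set up to read from stdin as the FAQ describes.

```python
import subprocess
import sys
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Toy stand-in for the parser (an assumption, not the real tool): for each
# input line it prints one numbered token per line, then a blank line.
_CHILD = r'''
import sys
for line in iter(sys.stdin.readline, ""):
    for i, tok in enumerate(line.split(), 1):
        print(i, tok, sep="\t")
    print(flush=True)
'''
STAND_IN_CMD = [sys.executable, "-u", "-c", _CHILD]


class PersistentFilter:
    """Keep one filter process alive and serve one request at a time."""

    def __init__(self, cmd):
        # stderr is discarded so that an unread stderr pipe cannot fill up
        # and block the child process.
        self.proc = subprocess.Popen(
            cmd,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
        )
        self.lock = threading.Lock()

    def request(self, sentence):
        """Send one sentence; return output lines up to the first blank line."""
        line = sentence.replace("\n", " ").strip() + "\n"
        with self.lock:  # callers must not interleave on the shared pipes
            self.proc.stdin.write(line.encode("utf8"))
            self.proc.stdin.flush()
            out = []
            while True:
                raw = self.proc.stdout.readline()
                if not raw:  # EOF means the child exited
                    raise RuntimeError("filter process ended unexpectedly")
                text = raw.decode("utf8").rstrip("\r\n")
                if text == "":  # assumed end-of-sentence marker
                    break
                out.append(text)
        return out

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()


def make_handler(filt):
    """Build a request handler that forwards each POST body to the filter."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            sentence = self.rfile.read(length).decode("utf8")
            body = "\n".join(filt.request(sentence)).encode("utf8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return Handler


def serve(cmd, port=8000):
    """Run the HTTP front end until interrupted (port is an arbitrary choice)."""
    filt = PersistentFilter(cmd)
    try:
        HTTPServer(("127.0.0.1", port), make_handler(filt)).serve_forever()
    finally:
        filt.close()
```

Calling serve(STAND_IN_CMD) would start the toy service, and a client would POST one pre-tagged sentence per request. Reading until a blank line, rather than calling readline() once, matters because a CoNLL-style result spans one line per token.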