Tags: python, nlp, stanford-nlp, named-entity-recognition

How do I host a CoreNLP server with caseless models?


I'm trying to host a CoreNLP server that uses the caseless models, but I don't think I've succeeded, and the official site doesn't have an example of hosting these models.

I'm currently hosting with:

java -mx4g \
           -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
           -port 9000 \
           -timeout 15000

This is the default way of hosting, which doesn't use the caseless models. I checked the app log, and it was loading the standard models instead of the caseless ones:

[pool-1-thread-1] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [0.9 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [0.5 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [0.5 sec].

Following https://stanfordnlp.github.io/CoreNLP/caseless.html, I downloaded the English models jar file and put it in the CoreNLP folder, but I don't know how to tell the server to use those models.

On the client side, I'm doing the following:

import requests

r = requests.post('http://[::]:9000/?properties={"annotators":"tokenize,ssplit,truecase,pos,ner","outputFormat":"json"}', 
                  data="show me hotels in toronto for next weekend")
print(r.text)

The truecase annotator is working, but I don't see the caseless models being used.

Any help would be appreciated.


Solution

  • You need to pass the "ner.model" property in the properties you send with the request, next to "annotators": "ner.model": "edu/stanford/nlp/models/ner/english.all.3class.caseless.distsim.crf.ser.gz,edu/stanford/nlp/models/ner/english.muc.7class.caseless.distsim.crf.ser.gz,edu/stanford/nlp/models/ner/english.conll.4class.caseless.distsim.crf.ser.gz"

    You may also want to use Stanza to access the Stanford CoreNLP server.

    Details here: https://stanfordnlp.github.io/stanza/corenlp_client.html#overview
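
To make the suggestion concrete, here is a minimal sketch of a client that sends the "ner.model" property together with the annotators from the question. It assumes the server is reachable at http://localhost:9000/ and that the English models jar is on the server's classpath. The helper names (build_properties, build_url, annotate) are made up for illustration. It uses only the standard library; the same properties dict could be JSON-encoded and passed to requests.post instead.

```python
import json
import urllib.parse
import urllib.request

# Caseless NER model paths, copied from the answer above.
CASELESS_NER_MODELS = ",".join([
    "edu/stanford/nlp/models/ner/english.all.3class.caseless.distsim.crf.ser.gz",
    "edu/stanford/nlp/models/ner/english.muc.7class.caseless.distsim.crf.ser.gz",
    "edu/stanford/nlp/models/ner/english.conll.4class.caseless.distsim.crf.ser.gz",
])


def build_properties(annotators="tokenize,ssplit,truecase,pos,ner"):
    """Return the pipeline properties, with ner.model pointing at the caseless models."""
    return {
        "annotators": annotators,
        "ner.model": CASELESS_NER_MODELS,
        "outputFormat": "json",
    }


def build_url(base_url="http://localhost:9000/"):
    """URL-encode the properties as JSON in the 'properties' query parameter."""
    query = urllib.parse.urlencode({"properties": json.dumps(build_properties())})
    return f"{base_url}?{query}"


def annotate(text, base_url="http://localhost:9000/"):
    """POST raw text to the server (assumed to be running) and return the parsed JSON."""
    request = urllib.request.Request(
        build_url(base_url),
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, annotate("show me hotels in toronto for next weekend") returns the parsed response. Each token under doc["sentences"][i]["tokens"] carries an "ner" field, and the server log should then name the caseless classifier files instead of the standard ones.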