The following request to 'http://corenlp.run' assigns the label 'dep' to all dependencies. Can someone explain this behavior? It looks like a bug to me, or could it be a limitation (such as a rate limit) of the public endpoint? The web interface, however, returns the correct response.
wget --post-data "Having earned a doctorate as a physical chemist, Merkel entered politics in the wake of the Revolutions of 1989, briefly serving as a deputy spokesperson for the first democratically-elected East German Government in 1990. Following German reunification in 1990, Merkel was elected to the Bundestag for Stralsund-Nordvorpommern-Rügen in the state of Mecklenburg-Vorpommern, a seat she has held ever since. Merkel was later appointed as the Minister for Women and Youth in 1991 under Chancellor Helmut Kohl, later becoming the Minister for the Environment in 1994. After Kohl was defeated in 1998, Merkel was elected Secretary-General of the CDU before becoming the party's first woman leader two years later in the aftermath of a donations scandal that toppled Wolfgang Schäuble." 'http://corenlp.run/?properties={"tokenize.whitespace": "true", "annotators": "tokenize,ssplit,pos,lemma,ner,parse, depparse,mention,coref", "outputFormat": "json",'timeout': 30000}' -O -
For other inputs, the parse attribute of the response looks quite strange, while the web interface again returns the correct answer. An example of a wrong parse response:
"parse":"(X ... (X their) (X stomachs) (X while) (X simultaneously) (X appealing) (X to) (X their) (X vanity.) (X The) ...)"
I tried the public endpoint because the latest compiled release suffers from this issue and the build instructions in the GitHub codebase seem outdated. I could not find a guide that describes how to build the *.jars provided in their bundle from the GitHub repo.
UPDATE:
I just tried the same request with a local instance of the latest CoreNLP Server and got the same issue. Only the web interface returns the correct response. If I remove the parse annotator, it works. However, I need both annotations.
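Chances are, you're hitting the server's default parse.maxlen limit of 60. Sentences over that limit appear to get a flat fallback tree of X nodes instead of a real parse, which would explain the output in the question. You can override the limit by explicitly setting the property parse.maxlen=<number_of_tokens> in the properties passed to the server. But beware: sentences longer than this are liable to take a very long time to parse.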
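As a rough illustration of passing parse.maxlen in the properties, here is a minimal Python sketch. It is not taken from the CoreNLP documentation: the helper names build_corenlp_url and annotate, the local address http://localhost:9000, and the value 100 are all assumptions chosen for the example, and the limit is sent as a string on the assumption that the server accepts property values in that form.

```python
import json
import urllib.request
from urllib.parse import urlencode, urlparse, parse_qs


def build_corenlp_url(base_url, max_len):
    """Build a server URL whose properties raise the parser's length limit.

    base_url and max_len are illustrative inputs, not recommended values.
    """
    props = {
        "annotators": "tokenize,ssplit,pos,lemma,ner,parse,depparse",
        "outputFormat": "json",
        # Assumption: property values are passed as strings.
        "parse.maxlen": str(max_len),
    }
    # urlencode percent-escapes the JSON so it survives as a query parameter.
    return base_url + "/?" + urlencode({"properties": json.dumps(props)})


def annotate(text, url, timeout_seconds=60):
    """POST raw text to the server and return the decoded JSON response."""
    request = urllib.request.Request(url, data=text.encode("utf-8"))
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running locally, annotate("Some long sentence ...", build_corenlp_url("http://localhost:9000", 100)) would send the request; expect long sentences to take noticeably longer to come back.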
If you only need dependencies, I recommend using the depparse annotator instead. This is what the demo at corenlp.run uses, and why it works on longer sentences.
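For the dependencies-only case, a small sketch along these lines may help. The property set below drops the parse annotator, and the helper checks a JSON response for the symptom from the question, where every edge is labelled 'dep'. The field names sentences, basicDependencies, and dep reflect my understanding of the server's JSON output and should be treated as assumptions; the function names and the sample response are made up for illustration.

```python
# Properties for dependency parsing without the constituency parser.
DEPPARSE_ONLY_PROPS = {
    "annotators": "tokenize,ssplit,pos,depparse",
    "outputFormat": "json",
}


def dependency_labels(response):
    """Collect the relation label of every basic dependency edge."""
    labels = []
    for sentence in response.get("sentences", []):
        for edge in sentence.get("basicDependencies", []):
            labels.append(edge["dep"])
    return labels


def looks_degenerate(response):
    """Return True when every non-root edge carries the generic 'dep' label."""
    labels = [label for label in dependency_labels(response) if label != "ROOT"]
    return bool(labels) and all(label == "dep" for label in labels)


# Hypothetical response fragment showing the symptom.
sample_bad = {
    "sentences": [
        {"basicDependencies": [
            {"dep": "ROOT", "governor": 0, "dependent": 1},
            {"dep": "dep", "governor": 1, "dependent": 2},
            {"dep": "dep", "governor": 1, "dependent": 3},
        ]}
    ]
}
```

Running looks_degenerate on each response is a quick way to spot inputs that fell back to the flat output.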