
How to get Enhanced++ dependency labels with a java command line in the terminal?


I don't really know Java, but I am trying to follow the Stanford NLP parser documentation to get the Enhanced++ dependency labels. This is the command I ran:

java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators "tokenize,ssplit,pos,lemma,depparse" -file input.txt

I do get output, but not the labels I expect. For example, input.txt contains the sentence "The older couple is picnicking with wine", where the dependency between "picnicking" and "wine" should be nmod, but instead it is obl:with. Another sentence is "What do you call it?", where I expect a dobj relation between "call" and "it", but instead I get obj.

What should I fix to get the labels of the enhanced universal dependencies?

(Also, do I really need to specify the options "tokenize,ssplit,pos,lemma" if I am interested only in "depparse"?)

Thank you.


Solution

  • You actually are getting enhanced++ dependency labels. However, the labels you expect appear to come from an older version of Universal Dependencies (UD).

    UD was revised between UDv1 and UDv2. One of the changes was that oblique modifiers of predicates (PPs in English) now take the relation obl rather than nmod, and nmod is restricted to modifiers of nominals. Hence obl, not nmod. In addition, enhanced dependencies, unlike basic dependencies, incorporate the case marker or preposition into the label, so you get obl:with. Similarly, in UDv2 the label dobj was renamed to simply obj.

    (And, yes, you do need to list all the annotators "tokenize,ssplit,pos,lemma", because they are required preprocessing steps for dependency parsing.)
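
If you only need to compare the parser output against older UDv1-style expectations, the two renames described above can be sketched as a small lookup. This is a minimal illustration, not part of CoreNLP: the class name `UdLabelCompat` and the method `toUdV1` are hypothetical, and the table covers only the obl/nmod and obj/dobj changes mentioned in this answer, not the full set of UDv1 to UDv2 differences.

```java
import java.util.Map;

// Hypothetical helper (not a CoreNLP class): rewrites a UDv2-style label
// into the UDv1-style label that the question expected to see.
final class UdLabelCompat {
    // Assumption: only the two renames discussed in the answer are covered.
    private static final Map<String, String> V2_TO_V1 = Map.of(
        "obl", "nmod",
        "obj", "dobj"
    );

    static String toUdV1(String label) {
        // Enhanced labels may carry a suffix such as ":with"; keep it unchanged.
        int colon = label.indexOf(':');
        String base = colon < 0 ? label : label.substring(0, colon);
        String suffix = colon < 0 ? "" : label.substring(colon);
        // Labels that are not in the table are returned as they are.
        return V2_TO_V1.getOrDefault(base, base) + suffix;
    }

    public static void main(String[] args) {
        System.out.println(toUdV1("obl:with")); // prints nmod:with
        System.out.println(toUdV1("obj"));      // prints dobj
        System.out.println(toUdV1("nsubj"));    // prints nsubj
    }
}
```

The mapping only goes in one direction: turning nmod back into obl would require knowing whether the head is a predicate or a nominal, which a label lookup cannot tell you.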