
French dependency parsing using CoreNLP


I am following the example in this link, and I have downloaded the French jar from here. When I run it as follows,

java -mx1g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -props StanfordCoreNLP-french.properties -annotators tokenize,ssplit,pos,depparse -file french.txt -outputFormat conllu

it always loads an English dependency parser model instead of the French one:

Loading depparse model file: edu/stanford/nlp/models/parser/nndep/english_UD.gz ... PreComputed 100000, Elapsed Time: 1.341 (s)

Is this a bug?


Solution

  • Update: I found that the default French properties file does not specify a depparse model, which is why the English one gets loaded. I now pass my own config file that sets depparse.model explicitly, and it works:

    annotators = tokenize, ssplit, pos, depparse, parse
    
    tokenize.language = fr
    
    pos.model = edu/stanford/nlp/models/pos-tagger/french/french.tagger
    
    parse.model = edu/stanford/nlp/models/lexparser/frenchFactored.ser.gz
    
    depparse.model = edu/stanford/nlp/models/parser/nndep/UD_French.gz
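
If you would rather generate the config from code than write it by hand, here is a minimal sketch using only java.util.Properties. It assumes the model paths from the config above exist inside your French jar, and it keeps only the four annotators from the original command (no parse). The class name FrenchDepparseProps and the output file name french-depparse.properties are made up for illustration.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

public class FrenchDepparseProps {

    // Builds the pipeline settings. The model paths are copied from the
    // config above; check that they match the contents of your French jar.
    static Properties build() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,depparse");
        props.setProperty("tokenize.language", "fr");
        props.setProperty("pos.model",
                "edu/stanford/nlp/models/pos-tagger/french/french.tagger");
        // The key setting: without it, depparse loads its default English model.
        props.setProperty("depparse.model",
                "edu/stanford/nlp/models/parser/nndep/UD_French.gz");
        return props;
    }

    public static void main(String[] args) throws IOException {
        Properties props = build();
        // Hypothetical output file name; pass this file to -props afterwards.
        try (Writer out = Files.newBufferedWriter(
                Paths.get("french-depparse.properties"), StandardCharsets.UTF_8)) {
            props.store(out, "French dependency parsing");
        }
        // With the CoreNLP jars on the classpath, the same Properties object
        // could instead be passed directly to new StanfordCoreNLP(props).
    }
}
```

The generated file can then be used in place of StanfordCoreNLP-french.properties in the original command, i.e. with -props french-depparse.properties. The "Loading depparse model file" log line should then show the French model path rather than english_UD.gz.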