In the documentation I see the Java class being called with these parameters:
java edu.stanford.nlp.parser.nndep.DependencyParser -tlp edu.stanford.nlp.trees.international.pennchinese.ChineseTreebankLanguagePack -trainFile chinese/train.conll -devFile chinese/dev.conll -embedFile chinese/embeddings.txt -embeddingSize 50 -model nndep.chinese.model.txt.gz
Where can I find the specification for these three files?
chinese/train.conll - this is the training file (format specification: http://ilk.uvt.nl/conll/#dataformat)
chinese/dev.conll - what is it?
chinese/embeddings.txt - what is it?
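chinese/train.conll, chinese/dev.conll: These are the training and dev (development) files, both in CoNLL 2006 format, as discussed in section 4.1 of the paper: http://cs.stanford.edu/~danqi/papers/emnlp2014.pdf . The dev file is a held-out set in the same format as the training file, used to evaluate the model during training. (In general we don't have permission to distribute data sets to others.)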
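To make the format concrete, here is a rough sketch of reading CoNLL 2006 (CoNLL-X) data: one token per line, ten tab-separated columns, and a blank line between sentences. The sample sentence, the `parse_conll` helper and the field names are made up for illustration and are not part of the parser; unused columns are filled with `_`.

```python
# Column names of the CoNLL-X (2006) format, in file order.
CONLL_FIELDS = ["id", "form", "lemma", "cpostag", "postag",
                "feats", "head", "deprel", "phead", "pdeprel"]


def parse_conll(text):
    """Split CoNLL-X text into sentences, each a list of token dicts.

    Hypothetical helper for illustration; not part of Stanford CoreNLP.
    """
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        cols = line.split("\t")
        if len(cols) != len(CONLL_FIELDS):
            raise ValueError("expected 10 tab-separated columns: %r" % line)
        token = dict(zip(CONLL_FIELDS, cols))
        token["id"] = int(token["id"])      # 1-based position in the sentence
        token["head"] = int(token["head"])  # id of the head token, 0 = root
        current.append(token)
    if current:
        sentences.append(current)
    return sentences


# Invented three-token example; real treebank files look the same per line.
SAMPLE = (
    "1\tThe\t_\tDT\tDT\t_\t2\tdet\t_\t_\n"
    "2\tdog\t_\tNN\tNN\t_\t3\tnsubj\t_\t_\n"
    "3\tbarks\t_\tVBZ\tVBZ\t_\t0\troot\t_\t_\n"
    "\n"
)

sentences = parse_conll(SAMPLE)
for tok in sentences[0]:
    print(tok["id"], tok["form"], tok["postag"], tok["head"], tok["deprel"])
```

A quick check like this on your own train.conll and dev.conll can catch lines with the wrong number of columns before you start training.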
chinese/embeddings.txt: These are pre-trained word embeddings, trained with word2vec as described in section 3.2 of the same paper.
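As a rough illustration of what such a file can look like, the sketch below assumes a plain-text layout with one word per line followed by its whitespace-separated vector values and no header line. That layout is an assumption here, so check it against the parser's documentation. The `load_embeddings` helper and the sample values are invented, and the sample uses 3 dimensions for brevity where the command above passes `-embeddingSize 50`.

```python
def load_embeddings(text):
    """Parse 'word v1 v2 ... vN' lines into a dict of word -> list of floats.

    Hypothetical helper; assumes whitespace-separated text with no header line.
    """
    vectors = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    # All vectors should share one dimensionality (the embedding size).
    sizes = {len(v) for v in vectors.values()}
    if len(sizes) > 1:
        raise ValueError("inconsistent embedding sizes: %s" % sorted(sizes))
    return vectors


# Invented sample values, 3 dimensions each.
SAMPLE_EMBEDDINGS = (
    "the 0.10 -0.20 0.30\n"
    "dog 0.05 0.40 -0.15\n"
    "barks -0.25 0.00 0.50\n"
)

embeddings = load_embeddings(SAMPLE_EMBEDDINGS)
embedding_size = len(next(iter(embeddings.values())))
print(len(embeddings), "words with embedding size", embedding_size)
```

The embedding size found this way should match the value given to `-embeddingSize`.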