stanford-nlp

What treebank was used to train the Stanford CoreNLP Spanish constituency parser?


I've searched the docs and the FAQs but haven't found the answer. Was the IULA treebank from Pompeu Fabra University used? https://www.iula.upf.edu/recurs01_tbk_uk.htm

Thanks.


Solution

  • The parser was trained on a preprocessed version of the AnCora Spanish 3.0 corpus.

    You can find more information about the training data and the preprocessing at http://nlp.stanford.edu/software/spanish-faq.html