nlp, stanford-nlp, syntaxnet

How do I process a dependency tree from SyntaxNet (CoNLL format)?


I think I need Semgrex from the edu.stanford.nlp package. To use it, I need to construct a Tree (edu.stanford.nlp.trees.Tree) and process it like this:

import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphFactory;
import edu.stanford.nlp.semgraph.semgrex.SemgrexMatcher;
import edu.stanford.nlp.semgraph.semgrex.SemgrexPattern;
import edu.stanford.nlp.trees.Tree;

public class SemgrexDemo {
    public static void main(String[] args) {
        Tree someHowBuiltTree; // I don't know how to construct a Tree from CoNLL
        SemanticGraph graph = SemanticGraphFactory.generateUncollapsedDependencies(someHowBuiltTree);
        // Match a node A linked to a node B through an nsubj relation
        SemgrexPattern semgrex = SemgrexPattern.compile("{}=A <<nsubj {}=B");
        SemgrexMatcher matcher = semgrex.matcher(graph);
    }
}

What I actually need is advice on how to construct the tree from CoNLL.


Solution

  • You want to load a SemanticGraph from your CoNLL file.

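    import java.util.Iterator;
    import edu.stanford.nlp.io.IOUtils;
    import edu.stanford.nlp.semgraph.SemanticGraph;
    import edu.stanford.nlp.trees.ud.CoNLLUDocumentReader;
    ...
    
    // conlluFile is the path to your CoNLL-U file
    CoNLLUDocumentReader reader = new CoNLLUDocumentReader();
    Iterator<SemanticGraph> it = reader.getIterator(IOUtils.readerFromString(conlluFile));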
    

    This will produce an Iterator that gives you a SemanticGraph for each sentence in your file. Semgrex patterns match against a SemanticGraph directly, so you do not need a Tree for this.

    Generating a constituency tree from a dependency parse is an open research problem, so to the best of my knowledge Stanford CoreNLP has no way to do that at this time.
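To make it clearer what kind of information ends up in each per-sentence graph, here is a minimal plain-Java sketch that reads one CoNLL-style sentence by hand and lists the nsubj edges. It does not use CoreNLP at all: the class `ConllNsubjSketch`, the method `pairsFor`, and the sample sentence are hypothetical names and data made up for illustration, and the code assumes the usual tab-separated layout with the word in column 2, the head index in column 7, and the relation label in column 8. It only finds direct edges, roughly what a pattern like `{}=A <nsubj {}=B` expresses, so it is not a replacement for Semgrex.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of CoreNLP: reads one CoNLL-style sentence by hand.
final class ConllNsubjSketch {

    // Made-up sample sentence. Assumed tab-separated columns:
    // ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
    static final String SAMPLE =
        "1\tThe\t_\tDET\tDT\t_\t2\tdet\t_\t_\n"
      + "2\tdog\t_\tNOUN\tNN\t_\t3\tnsubj\t_\t_\n"
      + "3\tbarks\t_\tVERB\tVBZ\t_\t0\tROOT\t_\t_\n";

    // Returns {dependent, governor} word pairs for every edge labeled `relation`.
    static List<String[]> pairsFor(String sentence, String relation) {
        // Index the token rows by their ID column so heads can be looked up.
        Map<Integer, String[]> byId = new LinkedHashMap<>();
        for (String line : sentence.split("\\R")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comment lines
            }
            String[] cols = line.split("\t");
            // Skip rows that are too short or whose ID/HEAD is not a plain integer.
            if (cols.length < 8 || !cols[0].matches("\\d+") || !cols[6].matches("\\d+")) {
                continue;
            }
            byId.put(Integer.parseInt(cols[0]), cols);
        }

        List<String[]> pairs = new ArrayList<>();
        for (String[] cols : byId.values()) {
            // HEAD 0 marks the root; no row has ID 0, so the lookup yields null.
            String[] governor = byId.get(Integer.parseInt(cols[6]));
            if (governor != null && cols[7].equals(relation)) {
                pairs.add(new String[] { cols[1], governor[1] });
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (String[] pair : pairsFor(SAMPLE, "nsubj")) {
            System.out.println(pair[0] + " <-nsubj- " + pair[1]); // prints "dog <-nsubj- barks"
        }
    }
}
```

For anything beyond direct edges, such as chains of relations or constraints on node attributes, loading the file into SemanticGraph objects and using Semgrex is likely the better route.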