Is there a way to parse the PTB tree below and get all of its subtrees? For example:
Text : Today is a nice day.
PTB : (3 (2 Today) (3 (3 (2 is) (3 (2 a) (3 (3 nice) (2 day)))) (2 .)))
I need every possible subtree.
Expected output:
(3 (2 Today) (3 (3 (2 is) (3 (2 a) (3 (3 nice) (2 day)))) (2 .)))
(2 Today)
(3 (3 (2 is) (3 (2 a) (3 (3 nice) (2 day)))) (2 .))
(3 (2 is) (3 (2 a) (3 (3 nice) (2 day))))
(2 is)
(3 (2 a) (3 (3 nice) (2 day)))
(2 a)
(3 (3 nice) (2 day))
(3 nice)
(2 day)
(2 .)
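The input file for this demo should contain one tree per line, in its bracketed string form. The example reads only the first tree in the file and prints its subtrees. Leaves (the bare words) are skipped, so the smallest trees printed are preterminals such as (2 Today).
The Stanford CoreNLP class of interest is Tree (edu.stanford.nlp.trees.Tree).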
import edu.stanford.nlp.trees.*;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class TreeLoadExample {

    // Prints t and every subtree below it in pre-order, skipping leaves (the bare words).
    public static void printSubTrees(Tree t) {
        if (t.isLeaf()) {
            return;
        }
        System.out.println(t);
        for (Tree subTree : t.children()) {
            printSubTrees(subTree);
        }
    }

    public static void main(String[] args) throws IOException {
        TreeFactory tf = new LabeledScoredTreeFactory();
        // args[0] is the path to a UTF-8 file with one bracketed tree per line.
        try (Reader r = new BufferedReader(
                new InputStreamReader(new FileInputStream(args[0]), StandardCharsets.UTF_8))) {
            TreeReader tr = new PennTreeReader(r, tf);
            Tree t = tr.readTree(); // reads only the first tree in the file
            printSubTrees(t);
        }
    }
}
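If pulling in CoreNLP is not an option, the same list of subtrees can be recovered from the bracketed string alone, because each subtree is exactly the text between an opening parenthesis and its matching closing one. The following is only a rough sketch of that idea, not part of CoreNLP: the class name PtbSubtrees and the method subtrees are made up for illustration, and it assumes the tree is on one line and that tokens never contain literal parentheses (PTB-style data normally escapes them as -LRB- and -RRB-).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not a CoreNLP class.
class PtbSubtrees {

    // Returns every bracketed subtree of a one-line PTB string, in pre-order.
    // Assumes tokens contain no literal '(' or ')' characters.
    static List<String> subtrees(String ptb) {
        Deque<Integer> open = new ArrayDeque<>();  // text offsets of unmatched '('
        Deque<Integer> slots = new ArrayDeque<>(); // result index reserved for each '('
        List<String> result = new ArrayList<>();
        for (int i = 0; i < ptb.length(); i++) {
            char c = ptb.charAt(i);
            if (c == '(') {
                open.push(i);
                slots.push(result.size());
                result.add(null); // placeholder so the output stays in pre-order
            } else if (c == ')') {
                if (open.isEmpty()) {
                    throw new IllegalArgumentException("Unbalanced ')' at offset " + i);
                }
                // Fill the slot reserved when the matching '(' was seen.
                result.set(slots.pop(), ptb.substring(open.pop(), i + 1));
            }
        }
        if (!open.isEmpty()) {
            throw new IllegalArgumentException("Unbalanced '(' in input");
        }
        return result;
    }

    public static void main(String[] args) {
        String ptb = "(3 (2 Today) (3 (3 (2 is) (3 (2 a) (3 (3 nice) (2 day)))) (2 .)))";
        for (String s : subtrees(ptb)) {
            System.out.println(s);
        }
    }
}
```

For the sentence in the question this prints the full tree first and (2 .) last, one subtree per line. It only slices the string, so you lose the label and child accessors that Tree gives you; if you need those, stick with the CoreNLP version above.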