I have Stanford CoreNLP up and running. My Maven dependencies are as follows:
<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>3.6.0</version>
</dependency>
<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>3.6.0</version>
    <classifier>models</classifier>
</dependency>
My code runs just fine as follows:
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Properties;

import org.junit.Test;

import edu.stanford.nlp.ling.CoreAnnotations.LemmaAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.NamedEntityTagAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.PartOfSpeechAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreeCoreAnnotations.TreeAnnotation;
import edu.stanford.nlp.util.CoreMap;

@Test
public void testTA() throws Exception
{
    // read some text from a file into the text variable
    Path p = Paths.get("s.txt");
    byte[] encoded = Files.readAllBytes(p);
    String text = new String(encoded, StandardCharsets.UTF_8);

    Properties props = new Properties();
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse, ner, dcoref");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    StringBuilder sb = new StringBuilder();
    sb.append(text);
    sb.append("\n\n===================================================================\n\n");

    // create an empty Annotation just with the given text
    Annotation document = new Annotation(text);

    // run all Annotators on this text
    pipeline.annotate(document);

    // these are all the sentences in this document;
    // a CoreMap is essentially a Map that uses class objects as keys and
    // has values with custom types
    List<CoreMap> sentences = document.get(SentencesAnnotation.class);
    sb.append("\n\n+++++++++++++++++++++++SENTENCES++++++++++++++++++++++++++++\n\n");
    for (CoreMap sentence : sentences)
    {
        // traversing the words in the current sentence;
        // a CoreLabel is a CoreMap with additional token-specific methods
        sb.append("\n==============SENTENCE==============\n");
        sb.append(sentence.toString());
        sb.append("\n");
        for (CoreLabel token : sentence.get(TokensAnnotation.class))
        {
            sb.append("\n==============TOKEN==============\n");
            // the text of the token
            String word = token.get(TextAnnotation.class);
            sb.append(word);
            sb.append(" : ");
            // the POS tag of the token
            String pos = token.get(PartOfSpeechAnnotation.class);
            sb.append(pos);
            sb.append(" : ");
            // the lemma of the token
            String lemma = token.get(LemmaAnnotation.class);
            sb.append(lemma);
            sb.append(" : ");
            // the NER label of the token
            String ne = token.get(NamedEntityTagAnnotation.class);
            sb.append(ne);
            sb.append("\n");
        }
        // this is the parse tree of the current sentence
        Tree tree = sentence.get(TreeAnnotation.class);
        sb.append("\n=====================TREE==================\n");
        sb.append(tree.toString());
        // this is the Stanford dependency graph of the current sentence
        SemanticGraph dependencies = sentence.get(CollapsedCCProcessedDependenciesAnnotation.class);
        sb.append("\n\n");
        sb.append(dependencies.toString());
    }
    System.out.println(sb);
}
However, when I add openie to the pipeline, the code fails:
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse, ner, dcoref, openie");
The error I get is as follows:
annotator "openie" requires annotator "natlog"
Can anyone advise me on this?
The answer is that annotators in the pipeline can depend on each other: openie requires natlog. Simply add natlog to the pipeline. Crucially, a dependency must be listed before the annotator that needs it, so natlog goes ahead of openie:

props.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse, ner, dcoref, natlog, openie");
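Once natlog and openie run, each sentence carries OpenIE relation triples that can be read inside the same per-sentence loop as above. A minimal sketch, assuming CoreNLP 3.6.0 and its models jar on the classpath (the annotation key and gloss methods are the ones used by the Stanford OpenIE demo; `appendTriples` is a hypothetical helper name):

```java
import java.util.Collection;

import edu.stanford.nlp.ie.util.RelationTriple;
import edu.stanford.nlp.naturalli.NaturalLogicAnnotations;
import edu.stanford.nlp.util.CoreMap;

public class TripleDump
{
    // Append one sentence's OpenIE triples (confidence, subject, relation, object)
    // to the StringBuilder, tab-separated, one triple per line.
    static void appendTriples(CoreMap sentence, StringBuilder sb)
    {
        Collection<RelationTriple> triples =
                sentence.get(NaturalLogicAnnotations.RelationTriplesAnnotation.class);
        for (RelationTriple triple : triples)
        {
            sb.append(triple.confidence)
              .append("\t").append(triple.subjectLemmaGloss())
              .append("\t").append(triple.relationLemmaGloss())
              .append("\t").append(triple.objectLemmaGloss())
              .append("\n");
        }
    }
}
```

Call it as `appendTriples(sentence, sb);` at the end of the sentence loop, next to where the parse tree and dependency graph are appended.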