Stanford SemanticGraph get sentence subject


How does one get the subject of a sentence (in a general way) using the SemanticGraph component from Stanford CoreNLP?

I've tried the code posted below, but the output indicates subject is null.

String sentence = "Carl has 84 Skittles.";
Annotation doc = InitUtil.initStanford(sentence, "tokenize, ssplit, pos, lemma, ner, parse");
SemanticGraph semGraph = doc.get(SENTENCE).get(0).get(DEPENDENCIES);
IndexedWord verb = semGraph.getFirstRoot();
IndexedWord subject = semGraph.getChildWithReln(verb, GrammaticalRelation.valueOf("nsubj"));
System.out.println(subject);

If I replace the second-to-last line with the three lines below, I get the expected output, "Carl". The difference appears to lie in a private field of GrammaticalRelation called specific, whose value seems to be sentence-specific. My question is how to get the subject in a way that works for all, or nearly all, sentences.

Set<GrammaticalRelation> relations = semGraph.childRelns(verb);
GrammaticalRelation relation = relations.iterator().next();
IndexedWord subject = semGraph.getChildWithReln(verb, relation);

Solution

  • Turns out the problem wasn't with the specific field.

    SemanticGraph.getChildWithReln relies on GrammaticalRelation.equals(), which checks whether the languages of the two objects are compatible. GrammaticalRelation.valueOf(String) returns a GrammaticalRelation whose language is Language.English, while the Stanford Parser uses Language.UniversalEnglish. The two languages are incompatible for some reason, so the comparison fails and the lookup returns null. Changing the call from GrammaticalRelation.valueOf(String) to GrammaticalRelation.valueOf(Language, String), passing Language.UniversalEnglish, solved the problem.
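To make the failure mode concrete, here is a minimal, self-contained sketch of the mechanism described above. The classes Lang, Relation, Edge and RelationLookupDemo are hypothetical stand-ins written for this illustration; they are not CoreNLP classes, and the compatibility rule (identical languages, or one side being a wildcard) is an assumption about how such a check typically behaves.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a language tag; NOT the CoreNLP Language enum.
enum Lang {
    ANY, ENGLISH, UNIVERSAL_ENGLISH;

    // Assumed rule: compatible only if identical, or if either side is ANY.
    boolean compatibleWith(Lang other) {
        return this == other || this == ANY || other == ANY;
    }
}

// Hypothetical stand-in for a grammatical relation such as "nsubj".
final class Relation {
    final Lang language;
    final String shortName;

    Relation(Lang language, String shortName) {
        this.language = language;
        this.shortName = shortName;
    }

    // Equality requires compatible languages AND the same short name.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Relation)) return false;
        Relation r = (Relation) o;
        return language.compatibleWith(r.language) && shortName.equals(r.shortName);
    }

    @Override
    public int hashCode() {
        return shortName.hashCode();
    }
}

// One outgoing edge from the root verb: a relation plus the child word.
final class Edge {
    final Relation relation;
    final String child;

    Edge(Relation relation, String child) {
        this.relation = relation;
        this.child = child;
    }
}

class RelationLookupDemo {
    // Mimics a child lookup that compares relations with equals().
    static String childWithReln(List<Edge> edges, Relation wanted) {
        for (Edge e : edges) {
            if (e.relation.equals(wanted)) return e.child;
        }
        return null;
    }

    // Edges as a parser tagging its relations UNIVERSAL_ENGLISH might produce them.
    static List<Edge> sampleEdges() {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge(new Relation(Lang.UNIVERSAL_ENGLISH, "nsubj"), "Carl"));
        edges.add(new Edge(new Relation(Lang.UNIVERSAL_ENGLISH, "dobj"), "Skittles"));
        return edges;
    }

    public static void main(String[] args) {
        List<Edge> edges = sampleEdges();
        // Same short name, incompatible language: no match.
        System.out.println(childWithReln(edges, new Relation(Lang.ENGLISH, "nsubj")));           // prints "null"
        // Same short name, same language: match.
        System.out.println(childWithReln(edges, new Relation(Lang.UNIVERSAL_ENGLISH, "nsubj"))); // prints "Carl"
    }
}
```

In the question's code, the corresponding change is to the argument of getChildWithReln: GrammaticalRelation.valueOf(Language.UniversalEnglish, "nsubj") instead of GrammaticalRelation.valueOf("nsubj").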