I am still experimenting with Stanford's CoreNLP, and I am getting strange results on a very simple coreference resolution test.
Given the two sentences:
The hotel had a big bathroom. It was very clean.
I would expect "It" in sentence 2 to be resolved to "bathroom", or at least to "a big bathroom", in sentence 1.
Unfortunately, it points to "The hotel", which in my opinion is wrong.
Is there a way to solve this problem? Do I need to train anything, or is it supposed to work out of the box?
Annotation a = getPipeline().getAnnotation("The hotel had a big bathroom. It was very clean.");
System.out.println(a.get(CorefChainAnnotation.class));
Output:
{1=CHAIN1-["The hotel" in sentence 1, "It" in sentence 2], 2=CHAIN2-["a big bathroom" in sentence 1]}
Many thanks for your help.
Like many components in AI, the Stanford coreference system is only accurate up to a point. For coreference that accuracy is relatively low (roughly 60 on a 0-100 scale on standard benchmarks). To illustrate the difficulty of the problem, consider the following apparently similar pair of sentences with a different coreference judgment:
The hotel had a big bathtub. It was very expensive.
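
As a rough illustration of why such near-identical pairs are hard, here is a toy sketch in plain Java. It does not use CoreNLP at all. The class `CorefHeuristicSketch`, the hand-written candidate lists, and the two heuristics are all hypothetical. The intended antecedent for the bathtub pair is assumed here to be "The hotel", since the text above only says that the judgment differs. Both pairs have the same surface structure, so any rule that looks only at position gives the same kind of answer for both and can be right on at most one of them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class CorefHeuristicSketch {

    /** One two-sentence example with hand-supplied candidates for "It". */
    static final class Example {
        final String text;
        final List<String> candidates; // noun phrases of sentence 1, in order
        final String intended;         // antecedent a human reader would pick

        Example(String text, List<String> candidates, String intended) {
            this.text = text;
            this.candidates = candidates;
            this.intended = intended;
        }
    }

    static final List<Example> EXAMPLES = Arrays.asList(
        // Intended antecedent as expected in the question.
        new Example("The hotel had a big bathroom. It was very clean.",
                    Arrays.asList("The hotel", "a big bathroom"),
                    "a big bathroom"),
        // ASSUMPTION: "The hotel" is taken as the intended antecedent here.
        new Example("The hotel had a big bathtub. It was very expensive.",
                    Arrays.asList("The hotel", "a big bathtub"),
                    "The hotel"));

    /** Heuristic 1: always pick the first candidate (the subject here). */
    static String pickSubject(Example e) {
        return e.candidates.get(0);
    }

    /** Heuristic 2: always pick the last candidate (most recent mention). */
    static String pickMostRecent(Example e) {
        return e.candidates.get(e.candidates.size() - 1);
    }

    /** Counts the examples on which the heuristic matches the intended antecedent. */
    static int countCorrect(Function<Example, String> heuristic) {
        int correct = 0;
        for (Example e : EXAMPLES) {
            if (heuristic.apply(e).equals(e.intended)) {
                correct++;
            }
        }
        return correct;
    }

    public static void main(String[] args) {
        System.out.println("subject heuristic: "
            + countCorrect(CorefHeuristicSketch::pickSubject) + " of 2 correct");
        System.out.println("most-recent heuristic: "
            + countCorrect(CorefHeuristicSketch::pickMostRecent) + " of 2 correct");
    }
}
```

Under these assumptions each heuristic gets exactly one of the two pairs right. Telling the pairs apart takes knowledge about the words themselves (what is typically "clean" versus "expensive"), which structure alone does not provide.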