Stanford CoreNLP wrong coreference resolution

I am still experimenting with Stanford's CoreNLP and am getting strange results on a trivial coreference resolution test.

Given the two sentences:

The hotel had a big bathroom. It was very clean.

I would expect "It" in sentence 2 to be resolved to "bathroom", or at least to "a big bathroom", in sentence 1.

Unfortunately, it points to "The hotel", which in my opinion is wrong.

Is there a way to solve this problem? Do I need to train anything, or is it supposed to work out of the box?

    Annotation a = getPipeline().getAnnotation("The hotel had a big bathroom. It was very clean.");

    System.out.println(a.get(CorefChainAnnotation.class));

Output:

{1=CHAIN1-["The hotel" in sentence 1, "It" in sentence 2], 2=CHAIN2-["a big bathroom" in sentence 1]}

Many thanks for your help.

Solution

Like many components in AI, the Stanford coreference system is only correct up to a certain accuracy. For coreference, that accuracy is relatively low: roughly 60 on a 0-100 scale on standard benchmarks. To illustrate the difficulty of the problem, consider the following apparently similar pair of sentences, which has a different coreference judgment:

The hotel had a big bathtub. It was very expensive.
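To make the point concrete, here is a toy sketch (not how CoreNLP actually resolves pronouns; the class `ToyPronounResolver` and both methods are hypothetical names invented for illustration). It assumes the candidate antecedents are already given in sentence order, with the grammatical subject first, and applies two purely positional rules. Because the two examples have the same structure, each rule makes the same choice in both, so if the correct antecedents differ, no rule of this kind can get both right.

```java
import java.util.Arrays;
import java.util.List;

// Toy illustration only: NOT the algorithm used by Stanford CoreNLP.
final class ToyPronounResolver {

    // Rule A: pick the first candidate, assumed here to be the grammatical subject.
    static String preferSubject(List<String> candidatesInOrder) {
        return candidatesInOrder.get(0);
    }

    // Rule B: pick the last candidate, i.e. the one closest to the pronoun.
    static String preferNearest(List<String> candidatesInOrder) {
        return candidatesInOrder.get(candidatesInOrder.size() - 1);
    }

    public static void main(String[] args) {
        // Candidate antecedents of "It", hand-listed in sentence order (an assumption).
        List<String> bathroomCase = Arrays.asList("The hotel", "a big bathroom");
        List<String> bathtubCase = Arrays.asList("The hotel", "a big bathtub");

        for (List<String> candidates : Arrays.asList(bathroomCase, bathtubCase)) {
            System.out.println("subject rule: " + preferSubject(candidates)
                    + " | nearest rule: " + preferNearest(candidates));
        }
    }
}
```

The subject rule answers "The hotel" for both sentences and the nearest rule answers the object for both. Telling the two cases apart needs some knowledge about the words themselves, beyond sentence structure, which is part of why accuracy on this task is limited.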