Interpreting score of Shift-Reduce Parser parse on Stanford's LabeledScoredTreeNode

When parsing a sentence with Stanford CoreNLP, one can get the log probability of a parse (a negative number) by calling .score on the constituency tree stored under TreeAnnotation. So, after running a Stanford pipeline and keeping the resulting annotation in, say, my-basic-annotation-object (I'm using Clojure for these examples for the sake of brevity), and parsing the sentence "The horse rode past the barn fell." like so

>>> (def basic-sentence-annotation (first (.get my-basic-annotation-object CoreAnnotations$SentencesAnnotation)))
>>> basic-sentence-annotation
#<Annotation The horse rode past the barn fell.>
>>> (def basic-parsed  (.get basic-sentence-annotation TreeCoreAnnotations$TreeAnnotation))
>>> basic-parsed
#<LabeledScoredTreeNode (ROOT (S (NP (DT The) (NN horse)) (VP (VBD rode) (SBAR (S (NP (IN past) (DT the) (NN barn)) (VP (VBD fell))))) (. .)))>

one can call .score on basic-parsed:

>>> (.score basic-parsed)
-60.86048126220703

But when I use the shift-reduce parser instead and call .score on the TreeAnnotation, I get a very large positive number rather than a negative log probability:

>>> (def sr-sentence-annotation (first (.get my-sr-annotation-object CoreAnnotations$SentencesAnnotation)))
>>> sr-sentence-annotation
#<Annotation The horse rode past the barn fell.>
>>> (def sr-parsed  (.get sr-sentence-annotation TreeCoreAnnotations$TreeAnnotation))
>>> sr-parsed
#<LabeledScoredTreeNode (ROOT (S (NP (NP (DT The) (NN horse)) (VP (VBD rode) (PP (IN past) (NP (DT the) (NN barn))))) (VP (VBD fell)) (. .)))>
>>> (.score sr-parsed)
6497.833389282227

I've spent some time looking at the API and the Stanford mailing list for some interpretation of this score, but haven't had any luck (I think the SR parser is too new for people to have encountered this problem yet). Any help would be appreciated.

Solution

Yes, this is expected. The score of a tree produced by the shift-reduce parser is the sum of the prediction scores of all the transitions used to build it, not a log probability.

The parser uses a multiclass perceptron to predict each transition. Perceptron scores are unnormalized, so the score of each transition, and consequently the score of a whole tree, can be any real number, positive or negative.
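
To make that concrete, here is a minimal toy sketch of perceptron-style scoring. It is not CoreNLP's actual implementation: the feature names, transition labels, and weights are all made up for illustration. The only point is that each transition score is a plain sum of feature weights, and the tree score is the sum of those transition scores, so nothing constrains the result to be negative or to behave like a log probability.

```clojure
;; Toy illustration only: hypothetical features and weights, not CoreNLP internals.

;; Map from [feature transition] to a learned weight (made-up values).
(def weights
  {["top=NN"  :shift]     2.5
   ["top=NN"  :reduce-np] 7.0
   ["top=VBD" :shift]     4.0
   ["top=VBD" :reduce-vp] -1.5})

(defn transition-score
  "Unnormalized perceptron score: the sum of the weights of the active
  features for the given transition. Unknown pairs contribute 0.0."
  [features transition]
  (reduce + (map #(get weights [% transition] 0.0) features)))

(defn derivation-score
  "Score of a whole derivation: the sum of its transition scores."
  [steps]
  (reduce + (map (fn [[features transition]]
                   (transition-score features transition))
                 steps)))

;; A hypothetical derivation; each step is [active-features chosen-transition].
(def toy-derivation
  [[["top=NN"]  :reduce-np]
   [["top=VBD"] :shift]
   [["top=VBD"] :reduce-vp]])

;; 7.0 + 4.0 + -1.5
(derivation-score toy-derivation) ; => 9.5

;; Reading the score from the question as a log probability would imply a
;; "probability" far greater than 1; the exponential overflows a double.
(Double/isInfinite (Math/exp 6497.833389282227)) ; => true
```

With more transitions and larger weights the total grows accordingly, which is how a score in the thousands can arise. Such scores are useful for ranking candidate trees within the same model, but they are not on the same scale as the log probabilities returned by the other parser.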

See the shift-reduce parser documentation for more information on the parser and references to papers discussing how it works.