
CRFSuite predictions are swallowed if label is ':'?


I am using CRFSuite for sequence classification (POS tagging). To my surprise, CRFSuite does not seem to like the label ':'. Tokens whose actual label is ':' are skipped entirely, and the prediction output gives no indication that an item is missing or was skipped.

I use other punctuation labels such as '.' or ',', and these are handled and output correctly.

Has anyone had a similar experience, or does anyone know why ':' is skipped?


Solution

  • From http://www.chokkan.org/software/crfsuite/tutorial.html:

    CRFsuite accepts any string as an attribute name as long as the string does not contain a colon character (that is used to separate an attribute name and its weight).

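    So if you have an attribute like w[0]=the:0.5, the attribute name is "w[0]=the" and the weight is 0.5. A label that consists only of ':' presumably runs into the same separator handling, which would explain why those items silently disappear.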
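
    One possible workaround, as a minimal sketch: replace the colon with a placeholder before writing the training/tagging data, and map it back after prediction. The placeholder "__COLON__" and the helper names below are made up for illustration and are not part of CRFsuite; the sketch also assumes attributes are passed as plain names without explicit weights.

```python
# Hypothetical placeholder; any colon-free string that is not already
# used as a label or inside an attribute name would work.
COLON_PLACEHOLDER = "__COLON__"


def encode_label(label):
    """Replace ':' so the label cannot be read as a name/weight separator."""
    return label.replace(":", COLON_PLACEHOLDER)


def decode_label(label):
    """Map a predicted label back to its original form."""
    return label.replace(COLON_PLACEHOLDER, ":")


def format_item(label, attributes):
    """Build one tab-separated data line: label first, then attribute names.

    Assumes `attributes` holds bare names (no ':weight' suffix), so every
    colon found in them is part of the name and gets replaced.
    """
    safe_attrs = [a.replace(":", COLON_PLACEHOLDER) for a in attributes]
    return "\t".join([encode_label(label)] + safe_attrs)


if __name__ == "__main__":
    # The token ':' with gold label ':' no longer contains a raw colon.
    print(format_item(":", ["w[0]=:", "w[-1]=the"]))
    # After tagging, restore the original label.
    print(decode_label("__COLON__"))
```

    If attributes do carry explicit weights, only the name part should be rewritten and the final ':weight' suffix kept as-is.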