I have been going through this blog post which contains a SimpleTagger example.
It says:
Given an input file "sample" as follows:
CAPITAL Bill noun
slept non-noun
here non-noun
where all but the last token on each line is a binary feature, and the last token on the line is the label name.
So, how do I add the word-level features here?
For example: the number of syllables in the word, the length of the word, etc.
Everything before the last token is treated as a feature, so you should be able to add arbitrary features ahead of the label. Because the features are binary, encode a numeric value as part of the feature name:
CAP SYL1 CHAR4 Bill noun
SYL3 CHAR9 responded non-noun
...
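
If you want to generate lines like these from your own data, here is a minimal Python sketch. The helper names (count_syllables, to_simpletagger_line) are made up for illustration, and the syllable counter is only a rough vowel-group heuristic, so swap in whatever you actually trust for that. The feature names CAP, SYL<n> and CHAR<n> are just the ones used above; SimpleTagger should not care what they are called as long as they contain no whitespace.

```python
import re


def count_syllables(word):
    # Rough heuristic (an assumption, not a linguistic rule):
    # count runs of consecutive vowels, with a minimum of one.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def to_simpletagger_line(word, label):
    # Build one line: binary features first, then the word, then the label.
    features = []
    if word[:1].isupper():
        features.append("CAP")
    # Numeric values become part of the feature name so they stay binary.
    features.append(f"SYL{count_syllables(word)}")
    features.append(f"CHAR{len(word)}")
    return " ".join(features + [word, label])


if __name__ == "__main__":
    tagged = [("Bill", "noun"), ("responded", "non-noun")]
    for word, label in tagged:
        print(to_simpletagger_line(word, label))
```

One thing to consider: an exact-value feature such as CHAR9 only fires for words of exactly that length, so with sparse data you may prefer to bucket lengths (for example CHAR_SHORT, CHAR_LONG) instead.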