I have a WORDTABLE containing numbers expressed as strings (zero, one, two, ..., n) plus the respective digits as features. I am trying to annotate a sequence of a fixed length of stringified numbers.
E.g.:
one two three four -> should be annotated
one two three four five six -> should not be annotated
So far I have:
WORDTABLE numbers = "numbers.csv";
DECLARE Annotation number(STRING int_string, STRING digit);
DECLARE Annotation numberSequence;
Document{-> MARKTABLE(number, 1, numbers, "digit" = 2)};
(number number) {-> MARK(numberSequence)};
This matches a sequence containing n stringified numbers. What I want is to fix the length of the sequence, with something like:
number[4,4] {-> MARK(numberSequence)};
where the minimum and maximum number of tokens in the sequence of stringified numbers are set to the same value, for example 4. Is it possible to do this?
Here's an example rule that annotates a text position only if it contains exactly four consecutive annotations of the type number:
ANY{-PARTOF(number)} @number[4,4] {-> MARK(numberSequence)} ANY{-PARTOF(number)};
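To show how this rule fits with the declarations from the question, here is a sketch of the combined script. It assumes, as in the question, that numbers.csv holds the number word in column 1 and its digit in column 2. The comments describe what each rule element is expected to do.

```
// Word list from the question; column 1 is assumed to hold the word, column 2 the digit.
WORDTABLE numbers = "numbers.csv";
DECLARE Annotation number(STRING int_string, STRING digit);
DECLARE Annotation numberSequence;

// Annotate every table entry found in the document as a number.
Document{-> MARKTABLE(number, 1, numbers, "digit" = 2)};

// @ makes the number element the starting point of the match; [4,4] requires exactly four.
// The ANY{-PARTOF(number)} elements require a token that is not a number on each side,
// so a run of five or more numbers is rejected.
ANY{-PARTOF(number)} @number[4,4] {-> MARK(numberSequence)} ANY{-PARTOF(number)};
```

With this script, "one two three four" surrounded by other tokens should be marked as numberSequence, while "one two three four five six" should not. Because both boundary elements are mandatory, a sequence at the very start or end of the document would presumably not match as written.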
DISCLAIMER: I am a developer of UIMA Ruta.