UIMA Ruta: creating an annotation with features separated by some text


I have some text with annotations created like the following:

wewf.werwfwef. wewfwefwwew. wefewefwff
AnnotationA
asdfawece aefae eafewfaefa aefafe ceaewfae
adfcaecae acaeaet aegaegageg caeacdaefa
AnnotationB
sadaeceaee aef aewfaegg rresf ceeaefaeaeaf
adfcaecae acaeaet aegaegageg caeacdaefa
AnnotationA
adfcaecae acaeaet aegaegageg caeacdaefa
adfcaecae acaeaet aegaegageg caeacdaefa
AnnotationB
adfcaecae acaeaet aegaegageg caeacdaefa
adfcaecae acaeaet aegaegageg caeacdaefa

I want to create an annotation with AnnotationA and its closest AnnotationB as features. How should I express this in Ruta?

I tried the following, which does not do what I want:

DECLARE Annotation TargetAnnotation (AnnotationA ana, AnnotationB anb);
Document {-> CREATE(TargetAnnotation, "ana" = AnnotationA, "anb" = AnnotationB)};

This rule matches the whole document, so the created annotation covers all of it. What I want is an annotation with AnnotationA and its closest AnnotationB as features. Thanks very much for any answer.


Solution

  • There are several ways to specify this in UIMA Ruta, and they mainly depend on the offsets the created TargetAnnotation should get. The CREATE action uses the span matched by the rule element to identify the values for the features.

    If the offsets of the created annotation do not really matter, then you can simply use the span combining both annotations AnnotationA and AnnotationB:

    (AnnotationA # AnnotationB){-> CREATE(TargetAnnotation, "ana" = AnnotationA, "anb" = AnnotationB)};
    

    Note that this rule introduces a sequential dependency between the two annotations: AnnotationA has to occur before AnnotationB. You can also specify rules that do not care about the order, but they will probably return too many matches. It depends on what you want to accomplish.

    If the offset of the created annotation should equal one of the provided annotations, e.g., AnnotationA, then you should use GATHER instead of CREATE. GATHER allows one to specify the index of the rule element whose match should be assigned to the feature.

     AnnotationA{-> GATHER(TargetAnnotation, "ana" = 1, "anb" = 3)} # AnnotationB;
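
    For reference, a minimal sketch of how the GATHER variant might look as a complete script. It assumes that AnnotationA and AnnotationB are declared in the script itself (in a real pipeline they may come from an imported type system, and other rules would create them); the declaration of TargetAnnotation is taken from the question.

    // Assumption: both types are declared here rather than imported.
    DECLARE AnnotationA, AnnotationB;
    DECLARE Annotation TargetAnnotation (AnnotationA ana, AnnotationB anb);

    // Rule elements are counted from 1:
    // 1 = AnnotationA, 2 = # (wildcard), 3 = AnnotationB.
    AnnotationA{-> GATHER(TargetAnnotation, "ana" = 1, "anb" = 3)} # AnnotationB;

    The wildcard # only extends up to the next position where AnnotationB matches, so each AnnotationA is paired with the closest AnnotationB that follows it, and the created TargetAnnotation gets the offsets of AnnotationA.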
    

    (I am a developer of UIMA Ruta)