How can we group all annotations between two annotations?
I'm new to GATE and am trying to group annotations together. I'm not sure whether this is possible, so please help. For example, in the following text:
Page-1
Age:53
Person: Nathan
Page-2
Treatment : Initial Evaluation
History: Yes
Page-3
..........
My gazetteer list contains different tags: a page tag for each page number, plus Age, Person, Treatment, History, etc. I want to group all tags between Page-1 and Page-2 under the Page-1 annotation, and all tags between Page-2 and Page-3 under the Page-2 annotation.
Please let me know if more information is required on this question.
Thanks in advance.
I'm not entirely sure what you mean by "group together", but you can certainly create annotations that span the content of each "page". Assuming you have a PageNumber annotation on each "Page-1", "Page-2", etc., you can use something like this to create annotations spanning from one PageNumber to the next. I'm using a control = once JAPE grammar to do this; you could equivalently use a Groovy script or a custom PR:
    Imports: { import static gate.Utils.*; }
    Phase: PageSpans
    Input: PageNumber
    Options: control = once

    Rule: PageSpan
    ({PageNumber})
    -->
    {
      try {
        List<Annotation> numbers = inDocumentOrder(inputAS.get("PageNumber"));
        for(int i = 0; i < numbers.size(); i++) {
          outputAS.add(
              start(numbers.get(i)), // from start of this PageNumber, to...
              (i+1 < numbers.size()
                ? start(numbers.get(i+1)) // start of the next number, or...
                : end(doc) // ...if no more PageNumbers then end of document
              ),
              "Page",
              // store the text under the PageNumber as a feature of Page
              featureMap("id", stringFor(doc, numbers.get(i))));
        }
      } catch(InvalidOffsetException e) {
        throw new JapeException("Invalid offset from existing annotation", e);
      }
    }
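The offset arithmetic in that rule can be tried out without GATE. The following is only a minimal sketch in plain Java: the class PageSpanSketch, the Span type and the pageSpans method are hypothetical names invented for illustration, not part of the GATE API, and the markers are assumed to be supplied in document order (which is what inDocumentOrder provides in the grammar above).

```java
import java.util.ArrayList;
import java.util.List;

class PageSpanSketch {
    /** Hypothetical stand-in for a GATE annotation: character offsets plus a label. */
    static final class Span {
        final long start;
        final long end;
        final String id;

        Span(long start, long end, String id) {
            this.start = start;
            this.end = end;
            this.id = id;
        }
    }

    /**
     * Builds one page span per marker, running from the start of that marker
     * to the start of the next one, or to docEnd for the last marker.
     * Assumes the markers are already sorted in document order.
     */
    static List<Span> pageSpans(List<Span> markers, long docEnd) {
        List<Span> pages = new ArrayList<>();
        for (int i = 0; i < markers.size(); i++) {
            long from = markers.get(i).start;
            long to = (i + 1 < markers.size()) ? markers.get(i + 1).start : docEnd;
            // keep the marker text as the page id, like the "id" feature above
            pages.add(new Span(from, to, markers.get(i).id));
        }
        return pages;
    }

    public static void main(String[] args) {
        String text = "Page-1\nAge:53\nPerson: Nathan\n"
                    + "Page-2\nTreatment : Initial Evaluation\nHistory: Yes\n";
        // locate the page markers by plain string search (a gazetteer does this in GATE)
        List<Span> markers = new ArrayList<>();
        for (String label : new String[] {"Page-1", "Page-2"}) {
            int at = text.indexOf(label);
            markers.add(new Span(at, at + label.length(), label));
        }
        for (Span page : pageSpans(markers, text.length())) {
            System.out.println(page.id + " covers [" + page.start + ", " + page.end + ")");
        }
    }
}
```

Each page span ends exactly where the next one starts, so every character after the first marker belongs to exactly one page.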
In your comment you ask about moving all the annotations under each "page" into a separate annotation set. This would be relatively straightforward once you have done the above, provided you have the page number as a feature on your Page annotations, as I have done with the "id" feature. You could then define another JAPE grammar that does something like this:
    Imports: { import static gate.Utils.*; }
    Phase: SetPerPage
    Input: Age X Y // and whatever other annotation types you want to copy
    Options: control = all

    Rule: MoveToPageSet
    ({Age}|{X}|{Y}):entity
    -->
    :entity {
      try {
        for(Annotation e : entityAnnots) {
          // find the (only) Page annotation that covers this entity
          Annotation thePage = getOnlyAnn(getCoveringAnnotations(inputAS, e, "Page"));
          // get the corresponding annotation set
          AnnotationSet pageSet = doc.getAnnotations(
              (String)thePage.getFeatures().get("id"));
          // and copy the annotation into it
          pageSet.add(start(e), end(e), e.getType(), e.getFeatures());
        }
      } catch(InvalidOffsetException ioe) {
        throw new JapeException("Invalid offset from existing annotation", ioe);
      }
      // optionally remove from input set
      // inputAS.removeAll(entityAnnots);
    }
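To see what the per-page grouping amounts to, here is a rough plain-Java sketch of the same idea, with a map of lists standing in for GATE's named annotation sets. PageGroupingSketch, Span and groupByPage are hypothetical names for illustration only, not GATE classes, and the sketch assumes every entity lies inside exactly one page span, just as the rule above relies on finding a single covering Page.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PageGroupingSketch {
    /** Hypothetical stand-in for a GATE annotation: character offsets plus a label. */
    static final class Span {
        final long start;
        final long end;
        final String label;

        Span(long start, long end, String label) {
            this.start = start;
            this.end = end;
            this.label = label;
        }
    }

    /**
     * Groups entities by the label of the single page span that covers them.
     * Throws if an entity is covered by no page or by more than one page.
     */
    static Map<String, List<Span>> groupByPage(List<Span> pages, List<Span> entities) {
        Map<String, List<Span>> sets = new LinkedHashMap<>();
        for (Span entity : entities) {
            Span covering = null;
            for (Span page : pages) {
                // a page covers an entity when it contains both of its offsets
                if (page.start <= entity.start && entity.end <= page.end) {
                    if (covering != null) {
                        throw new IllegalStateException("More than one page covers " + entity.label);
                    }
                    covering = page;
                }
            }
            if (covering == null) {
                throw new IllegalStateException("No page covers " + entity.label);
            }
            // one list per page id, analogous to one annotation set per page
            sets.computeIfAbsent(covering.label, k -> new ArrayList<>()).add(entity);
        }
        return sets;
    }

    public static void main(String[] args) {
        List<Span> pages = new ArrayList<>();
        pages.add(new Span(0, 29, "Page-1"));
        pages.add(new Span(29, 70, "Page-2"));

        List<Span> entities = new ArrayList<>();
        entities.add(new Span(7, 13, "Age"));
        entities.add(new Span(14, 28, "Person"));
        entities.add(new Span(36, 45, "Treatment"));

        for (Map.Entry<String, List<Span>> entry : groupByPage(pages, entities).entrySet()) {
            StringBuilder labels = new StringBuilder();
            for (Span s : entry.getValue()) {
                labels.append(' ').append(s.label);
            }
            System.out.println(entry.getKey() + ":" + labels);
        }
    }
}
```

Failing loudly on an uncovered entity is a deliberate choice in this sketch; in a real pipeline you might prefer to skip annotations that appear before the first page marker.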