spaCy - Spans that lie completely within another Span

I have spaCy docs that use spans, such as:

sent = 'I eat 5 apples and 2 bananas.'
doc = nlp(sent)

doc.spans['sc'] = [
   Span(doc, 2, 3, 'Ingredient'),
   Span(doc, 5, 6, 'Ingredient'),
   Span(doc, 2, 6, 'Meal')]

How can I iterate over all spans with the label 'Meal' and show the spans that lie completely within the boundaries of each of those spans? I know there is something similar for ents within spans, but that is not what I'm looking for.

Solution

spaCy's SpanGroup object has a has_overlap property that you can use as an initial check: if no spans in the group overlap, no span can lie within another. After that, a couple of loops or list comprehensions that compare the .start and .end attributes of the spans are enough.

Here's how I would write a snippet to handle such a task:

import spacy
from spacy.tokens import Span

nlp = spacy.load('en_core_web_sm')

sent = 'I eat 5 apples and 2 bananas.'
doc = nlp(sent)

doc.spans['sc'] = [
    Span(doc, 0, 1, 'Subject'),
    Span(doc, 1, 2, 'Verb'),
    Span(doc, 3, 4, 'Ingredient'),
    Span(doc, 6, 7, 'Ingredient'),
    Span(doc, 2, 7, 'Meal')]

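# Quick pre-check: if no spans overlap, none can be nested inside another.
if doc.spans['sc'].has_overlap:
    # (start, end) token offsets of every 'Meal' span; end is exclusive.
    meal_start_ends = [(span.start, span.end) for span in doc.spans['sc'] if span.label_ == 'Meal']
    # For each meal, collect the 'Ingredient' spans that lie entirely inside it.
    meal_ingredients = [[ig for ig in doc.spans['sc'] if ig.start >= meal[0] and ig.end <= meal[1] and ig.label_ == 'Ingredient'] for meal in meal_start_ends]
    print(meal_ingredients)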

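This snippet should print [[apples, bananas]]: one inner list per 'Meal' span, holding the 'Ingredient' spans that lie within it. Note that I adjusted the token offsets from your question so that the 'Ingredient' spans cover "apples" and "bananas" rather than the numbers in front of them. Hopefully this is what you wanted to achieve.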
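If you need this in more than one place, the same containment check could be wrapped in a small helper. The following is only a sketch: spans_within and StubSpan are names I made up for illustration, not part of spaCy. StubSpan is a stand-in that mimics the three Span attributes used here (.start, .end, .label_), so the example runs without loading a model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StubSpan:
    """Hypothetical stand-in with the Span attributes this sketch relies on."""
    start: int   # index of the first token
    end: int     # index one past the last token (exclusive)
    label_: str


def spans_within(spans, outer_label, inner_label=None):
    """Pair each span labelled outer_label with the spans lying entirely inside it.

    If inner_label is given, only inner spans with that label are kept.
    """
    spans = list(spans)  # materialise once so identity checks below are stable
    result = []
    for outer in spans:
        if outer.label_ != outer_label:
            continue
        inner = [
            s for s in spans
            if s is not outer
            and (inner_label is None or s.label_ == inner_label)
            and outer.start <= s.start and s.end <= outer.end
        ]
        result.append((outer, inner))
    return result


# Same offsets and labels as in the snippet above.
spans = [
    StubSpan(0, 1, 'Subject'),
    StubSpan(1, 2, 'Verb'),
    StubSpan(3, 4, 'Ingredient'),
    StubSpan(6, 7, 'Ingredient'),
    StubSpan(2, 7, 'Meal'),
]

for meal, inner in spans_within(spans, 'Meal'):
    print(meal.label_, [(s.start, s.end, s.label_) for s in inner])
# prints: Meal [(3, 4, 'Ingredient'), (6, 7, 'Ingredient')]
```

Since the helper only reads .start, .end and .label_, I would expect spans_within(doc.spans['sc'], 'Meal', 'Ingredient') to work the same way on the real span group, returning the actual Span objects instead of the stand-ins.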