
Highlight text parts based on labels


Thanks to fellow Stack Overflow users, I have data labels that I would like to highlight in the text.

For example, I have a product description:

Description: Tampered black round grey/natural swing with yellow load-bearing left hook

The features were extracted as:

colors = ['black', 'grey', 'natural', 'yellow']
shape = ['round']
direction = ['left']

In spaCy it is possible to highlight features like this:

(image not available)

Is it possible to highlight the text in the same way using the labels I already have, so that the labels are shown in the text too? I don't know whether spaCy is the right tool or whether another one would be better.

Thanks.


Solution

  • I'm not entirely sure what you're asking, but you can put entities of your own on the spaCy Doc object and pass them to displaCy.

    To simply set entities manually, you can do this:

    from spacy.tokens import Span

    doc = nlp(...)
    # Span(doc, start, end, label): token indices of the span to highlight, plus its label
    span = Span(doc, 0, 1, label="COLOR")
    ents = [span] # in reality you could do more than one
    doc.ents = ents
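
If the labels only exist as plain word lists, as in the question, the positions to highlight have to be located first. The following is a minimal sketch of one way to do that with the standard library alone. The label names `SHAPE` and `DIRECTION`, the `features` dict and the helper `find_entities` are assumptions made for illustration; they are not part of spaCy. The result uses the `text`/`ents`/`title` dict layout that displaCy's manual mode accepts.

```python
import re

text = "Tampered black round grey/natural swing with yellow load-bearing left hook"

# Label names are illustrative; only the word lists come from the question.
features = {
    "COLOR": ["black", "grey", "natural", "yellow"],
    "SHAPE": ["round"],
    "DIRECTION": ["left"],
}


def find_entities(text, features):
    """Return entity dicts with character offsets, sorted by start position."""
    ents = []
    for label, words in features.items():
        for word in words:
            # \b restricts matches to whole words, e.g. "left" but not "leftover".
            pattern = rf"\b{re.escape(word)}\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                ents.append(
                    {"start": match.start(), "end": match.end(), "label": label}
                )
    return sorted(ents, key=lambda ent: ent["start"])


ents = find_entities(text, features)
example = {"text": text, "ents": ents, "title": None}
```

With spaCy installed, `example` can then be passed to `displacy.render(example, style="ent", manual=True)`, which renders the highlighted text without needing a pipeline or a Doc object. Note that this simple approach does not resolve overlapping matches, so the word lists should not contain terms that overlap in the text.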
    

    If you have a word list and need to find those words in the text, you can use rule-based matching with an EntityRuler. See the rule-based matching guide in the spaCy documentation.
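
The EntityRuler approach could look roughly like the sketch below, assuming spaCy v3 (where the ruler is added with `nlp.add_pipe("entity_ruler")`). The label names `SHAPE` and `DIRECTION`, the `features` dict and the hex colours are illustrative assumptions, not something the question or spaCy prescribes. A blank English pipeline is used so that no trained model has to be downloaded.

```python
import spacy
from spacy import displacy

text = "Tampered black round grey/natural swing with yellow load-bearing left hook"

# Label names are illustrative; only the word lists come from the question.
features = {
    "COLOR": ["black", "grey", "natural", "yellow"],
    "SHAPE": ["round"],
    "DIRECTION": ["left"],
}

# A blank pipeline only tokenizes, which is all the ruler needs here.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")

# One single-token pattern per word; LOWER makes the match case-insensitive.
ruler.add_patterns(
    [
        {"label": label, "pattern": [{"LOWER": word.lower()}]}
        for label, words in features.items()
        for word in words
    ]
)

doc = nlp(text)
found = [(ent.text, ent.label_) for ent in doc.ents]

# Custom labels have no preset colour in displaCy, so some are assigned here
# (arbitrary choices).
options = {
    "colors": {"COLOR": "#ffd966", "SHAPE": "#9fc5e8", "DIRECTION": "#b6d7a8"}
}

# jupyter=False makes render() return the markup as a string.
html = displacy.render(doc, style="ent", options=options, jupyter=False)
```

The `html` string can be written to a file and opened in a browser. In a Jupyter notebook, leaving out `jupyter=False` should display the highlighted text inline instead.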