I use spaCy's token.conjuncts to get the conjuncts of each token. However, token.conjuncts returns a tuple, but I want a Span, for example:
import spacy

nlp = spacy.load("en_core_web_lg")
sentence = "I like to eat food at the lunch time, or even at the time between a lunch and a dinner"
doc = nlp(sentence)
for token in doc:
    conj = token.conjuncts
    print(type(conj))
# output: <class 'tuple'>
Does anyone know how to convert this tuple into a Span? Or how can I get the conjuncts as a Span directly?
The reason I need a Span is that I want to use it to locate the conjunct, for example to find which noun chunk or split (however I split the sentence) it belongs to. Currently, I convert the tuple to str and iterate over all the splits or noun chunks to check whether any of them contains the conjunct.
However, this has a bug: when a conjunct (of a token) appears in more than one split or noun chunk, I cannot tell which one actually contains it, because I only compare the str and not the index or id of the conjunct. If I had a Span for the conjunct, I could locate its exact position.
Please feel free to comment, thanks in advance!
token.conjuncts returns a tuple of Token objects. To get a one-token Span for a conjunct conj, slice the doc: doc[conj.i : conj.i + 1]
import spacy

nlp = spacy.load('en_core_web_sm')
sentence = "I like oranges and apples and lemons."
doc = nlp(sentence)
for token in doc:
    if token.conjuncts:
        conjuncts = token.conjuncts  # tuple of conjuncts
        print("Conjuncts for ", token.text)
        for conj in conjuncts:
            # conj is type of Token
            span = doc[conj.i: conj.i+1]  # Here's span
            print(span.text, type(span))
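If the end goal is only to find which noun chunk or split contains a conjunct, comparing token indices may be enough, and it avoids the ambiguity of matching on str. Below is a minimal sketch of that idea. The helper find_containing_span and the FakeSpan stand-in are hypothetical names, not part of spaCy, and the offsets in chunks are made up for illustration. The helper only assumes each span exposes start (inclusive) and end (exclusive) token offsets, as a spaCy Span does.

```python
from typing import NamedTuple, Optional, Sequence


class FakeSpan(NamedTuple):
    """Hypothetical stand-in for a spaCy Span: start inclusive, end exclusive."""
    start: int
    end: int


def find_containing_span(token_index: int, spans: Sequence) -> Optional[int]:
    """Return the position in `spans` of the first span covering token_index, or None."""
    for position, span in enumerate(spans):
        if span.start <= token_index < span.end:
            return position
    return None


# Illustrative token offsets only (assumed, not produced by a real parse).
chunks = [FakeSpan(0, 1), FakeSpan(2, 3), FakeSpan(4, 5), FakeSpan(6, 7)]
print(find_containing_span(4, chunks))  # 2
print(find_containing_span(3, chunks))  # None
```

With a real doc, something like find_containing_span(conj.i, list(doc.noun_chunks)) should work the same way, since the lookup uses the token's position in the doc rather than its text, so repeated words do not collide.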