Sorry about the title; I wasn't sure how to phrase it, but hopefully these examples make it clear.
For the following sentence:
Ashley and Brian are drinking water.
I want the noun chunk to be "Ashley and Brian", but instead I get two separate chunks, "Ashley" and "Brian".
Another example is:
Types of clothes include shirts, pants and trousers.
I want the noun chunk to be "shirts, pants and trousers" instead of "shirts", "pants" and "trousers".
How do I solve this problem?
What you are describing is not a noun chunk. spaCy's noun chunks are base noun phrases, so they never span a coordination. The conjuncts feature (Token.conjuncts) is closer to what you want.
This might not work for complex sentences, but at least it'll cover your examples and typical cases.
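import spacy

nlp = spacy.load("en_core_web_sm")

texts = [
    "Ashley and Brian are drinking water.",
    "Types of clothes include shirts, pants and trousers.",
]

for text in texts:
    print("-----")
    print(text)
    checked = 0  # index of the first token not yet covered by a printed span
    doc = nlp(text)
    for tok in doc:
        # Skip tokens that are already inside a span we printed.
        if tok.i < checked:
            continue
        if tok.pos_ not in ("NOUN", "PROPN"):
            continue
        # Only keep nouns that take part in a coordination ("X and Y").
        if tok.conjuncts:
            # left_edge/right_edge are the first and last tokens of tok's
            # subtree; for the first conjunct that covers the coordination.
            print(doc[tok.left_edge.i : tok.right_edge.i + 1])
            checked = tok.right_edge.i + 1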
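
If you want the spans back as a list rather than printed, the same loop can be wrapped in a function. This is only a sketch: the name coordinated_spans is my own, and to keep it runnable without downloading a model I build the Doc by hand, so the heads, deps and pos values below are assumptions I typed in, not real parser output. It also assumes spaCy v3, where the Doc constructor accepts heads as absolute token indices.

```python
from spacy.tokens import Doc
from spacy.vocab import Vocab


def coordinated_spans(doc):
    """Collect spans that cover a noun together with its conjuncts.

    Hypothetical helper: same logic as the loop above, but it returns
    the spans instead of printing them.
    """
    spans = []
    checked = 0  # index of the first token not yet covered by a span
    for tok in doc:
        if tok.i < checked:
            continue
        if tok.pos_ not in ("NOUN", "PROPN"):
            continue
        if tok.conjuncts:
            spans.append(doc[tok.left_edge.i : tok.right_edge.i + 1])
            checked = tok.right_edge.i + 1
    return spans


# Hand-written parse for the first example sentence (an assumption,
# standing in for what a trained pipeline would predict).
words = ["Ashley", "and", "Brian", "are", "drinking", "water", "."]
pos = ["PROPN", "CCONJ", "PROPN", "AUX", "VERB", "NOUN", "PUNCT"]
heads = [4, 0, 0, 4, 4, 4, 4]  # absolute index of each token's head
deps = ["nsubj", "cc", "conj", "aux", "ROOT", "dobj", "punct"]

doc = Doc(Vocab(), words=words, pos=pos, heads=heads, deps=deps)
result = [span.text for span in coordinated_spans(doc)]
print(result)
```

With a real pipeline you would pass nlp(text) instead of the hand-built doc. Whether the spans come out right then depends on the parser attaching the conjuncts the way you expect, which is why this can break on complex sentences.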