I'm trying to convert an Excel sheet into a Doc object using spaCy. I've spent the last couple of days trying to work around it, but it has proven challenging. I have opened the sheet with both openpyxl and pandas, and I can read it and output the content, but I couldn't integrate spaCy to create Doc/Token objects.
Is it possible to process Excel sheets in spaCy's pipeline?
Thank you!
spaCy has no built-in support for Excel. You could use pandas to read either a CSV file or an Excel file, like
import pandas as pd
df = pd.read_csv(file)
or
df = pd.read_excel(file)
respectively. Then select the text column you need, iterate over its values, and pass each one to spaCy's nlp(). Note that nlp() expects strings, so skip or convert empty (NaN) cells first.
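
Here is a minimal sketch of that last step, in case it helps. It makes a few assumptions that are not in your question: the column name "text" and the helper column_to_docs are hypothetical, and a small in-memory DataFrame stands in for the result of pd.read_excel(file). It uses a blank English pipeline so it runs without downloading a model, which means you only get tokenization.

```python
import pandas as pd
import spacy

# A blank English pipeline only tokenizes. If you have a trained model
# installed (for example en_core_web_sm), spacy.load() it instead to get
# tagging, parsing, and named entities.
nlp = spacy.blank("en")


def column_to_docs(df, column):
    """Return one spaCy Doc per non-empty cell in the given column."""
    # Drop empty cells and force everything to str, since nlp expects strings.
    texts = df[column].dropna().astype(str).tolist()
    # nlp.pipe processes the texts as a batch, which is usually faster
    # than calling nlp() once per row.
    return list(nlp.pipe(texts))


# Stand-in for: df = pd.read_excel(file)
# "text" is a hypothetical column name; use whatever your sheet has.
df = pd.DataFrame({"text": ["Hello world", None, "spaCy reads strings"]})

docs = column_to_docs(df, "text")
for doc in docs:
    print([token.text for token in doc])
```

Each item in docs is an ordinary Doc, so you can iterate over its tokens as usual. If you need to keep the link back to the spreadsheet row, one option is to keep the filtered index (df[column].dropna().index) alongside the list of docs.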