Is there an easier way to load an Excel file directly into a NumPy array?
I have looked at the numpy.genfromtxt autoloading function in the NumPy documentation, but it doesn't load Excel files directly.
array = np.genfromtxt("Stats.xlsx")
ValueError: Some errors were detected !
Line #3 (got 2 columns instead of 1)
Line #5 (got 5 columns instead of 1)
......
Right now I am using openpyxl.reader.excel to read the Excel file and then appending the values to NumPy 2D arrays. This seems inefficient.
Ideally, I would like to load the Excel file directly into a NumPy 2D array.
Honestly, if you're working with heterogeneous data (as spreadsheets are likely to contain), using a pandas.DataFrame is a better choice than using numpy directly.
While pandas is in some sense just a wrapper around numpy, it handles heterogeneous data very nicely. (As well as a ton of other things... For "spreadsheet-like" data, it's the gold standard in the Python world.)
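To illustrate the heterogeneous-data point, here is a small sketch. The rows and column names are made up for the example and are not from the question's spreadsheet.

```python
import numpy as np
import pandas as pd

# Hypothetical rows mixing text and numbers, as a spreadsheet might.
rows = [["alice", 1.5], ["bob", 2.5]]

# A plain NumPy array has a single dtype, so the numbers are coerced to strings.
arr = np.array(rows)

# A DataFrame keeps a separate dtype for each column.
df = pd.DataFrame(rows, columns=["name", "score"])

# Numeric columns can still be pulled out as ordinary float arrays.
scores = df["score"].to_numpy()
```

Inspecting arr.dtype and df.dtypes shows the difference: the array stores everything as text, while the DataFrame keeps the score column numeric.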
If you decide to go that route, just use pandas.read_excel.
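A minimal sketch of that route, assuming the sheet has no header row and that an Excel engine such as openpyxl is installed for pandas to use. The helper name excel_to_array is only for illustration, not part of pandas.

```python
import numpy as np
import pandas as pd


def excel_to_array(path, sheet_name=0, header=None):
    """Read one worksheet and return its cells as a 2D NumPy array."""
    # header=None treats every row as data; pass header=0 if the
    # first row of the sheet holds column names instead.
    df = pd.read_excel(path, sheet_name=sheet_name, header=header)
    # to_numpy() gives a numeric array for all-numeric sheets and an
    # object array when the columns have mixed types.
    return df.to_numpy()
```

With the file from the question, this would be called as array = excel_to_array("Stats.xlsx"). If the sheet mixes text and numbers, the result is an object array, so it may be better to stay with the DataFrame and convert only the numeric columns.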