Search code examples
pythonexcelvbapandaslistobject

Read Excel sheet table (Listobject) into python with pandas


There are multiple ways to read excel data into python. Pandas provides aslo an API for writing and reading

import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile

df = pd.read_excel('File.xlsx', sheetname='Sheet1')

That works fine.

BUT: What is the way to access the tables of every sheet directly into a pandas dataframe??

enter image description here

The above picture shows a sheet including a table SEPARATED THAN CELL (1,1).

Moreover the sheet might include several tables (listobjects in VBA).

I can not find anywhere the way to read them into pandas.

Note1: It is not possible to modify the workbook to bring all the tables towards cell(1,1). Note2: I would like to use just pandas (if it is possible) and minimize the need to import other libraries. But it there is no other way I am ready to use other lybray. In any case I could not manage with xlwings for instance.

here it looks like its possible to parse the excel file, but no soilution is provided for tables, just for complete sheets.

The documentation of pandas does not seem to offer that possibility.

Thanks.


Solution

  • I understand that this question has been marked solved already, but I found an article that provides a much more robust solution: Full Post

    I suppose a newer version of this library supports better visibility of the workbook structure. Here is a summary:

    1. Load the workbook using the load_workbook function from openpyxl
    2. Then, you are able to access the sheets within, which contains collection of List-Objects (Tables) in excel.
    3. Once you gain access to the tables, you are able to get to the range addresses of those tables.
    4. Finally they loop through the ranges and create a pandas data-frame from it.

    This is a nicer solution as it gives us the ability to loop through all the sheets and tables in a workbook.