I have many excel files in a folder. I want to load all sheets except the first sheet into a dataframe. Each excel has different number of sheets. For example I have E1.xlsx, E2.xlsx. E1 has 5 sheets, and E2 has 9. I need to load sheet 1,2,3,4 from E1, and 1, 2, 3, 4, 5, 6, 7, 8 from E2 (without sheet 0).
filenames = glob.glob( "/*.xlsx")
d = []
for filename in filenames:
df = pd.read_excel(filename, index_col=None, sheet_name=1) # don't know how to set sheet number
d.append(df)
data = pd.concat(d, axis=0, ignore_index=True)
I have my code here, but don't know how to set the sheet number. (the column in sheet is totally different from other sheet, I think it's not a good choice to read all sheets and delete)
Can you try the following:
import glob
import pandas as pd
filenames = glob.glob( "/*.xlsx")
d = []
not_req_sheets = [0]
for filename in filenames:
xl = pd.ExcelFile(filename)
for sheet_name in xl.sheet_names:
if sheet_name not in not_req_sheets:
df = xl.parse(sheet_name, index_col=None)
d.append(df)
data = pd.concat(d, axis=0, ignore_index=True)