I have 5 Excel files, each file contains 8 sheets, each sheet contains around 30 rows. So this means 1 file has in total 30 x 8 = 240 rows. Is there a speedy trick I can use to combine all these 5 files (including sheets) into 1 Excelfile with in total 240 x 5 rows = 1200 rows?
This is my current code:
import os
import pandas as pd
files = os.listdir('c:\data\KM\Desktop\my_folder')
os.chdir(r'c:\data\KM\Desktop\my_folder')
df = pd.DataFrame()
for file in files:
if file.endswith('.xlsx'):
df = df.append(pd.read_excel(file))
df.head()
df.to_excel('all_files.xlsx')
Now with this code I have 2 problems:
Appreciate your help all.
Use read_excel
with sheet_name=None
for all sheets, join together by concat
for list of DataFrames and last use it again for one big DataFrame:
import glob
files = glob.glob(r'c:/data/KM/Desktop/my_folder/*.xlsx')
dfs = (pd.concat(pd.read_excel(fp, sheet_name=None)) for fp in files)
dfbig = pd.concat(dfs, ignore_index=True)
Edit: For remove last sheetname convert orderdict to lists of DataFrame and remiove last by indexing:
files = glob.glob(r'D:/Dropbox/work-joy/so/files/*.xlsx')
dfs = (pd.concat([v for k, v in pd.read_excel(fp, sheet_name=None).items()][:-1])
for fp in files)
df = pd.concat(dfs, ignore_index=True)