Instead of manually entering the parameters for each Excel sheet, as follows:
import pandas as pd
df1 = pd.read_excel(r"C:\Users\XY\Sales2020.xlsm",
sheet_name = "Europe",usecols=[1,2,4,6],header=4) #reads sheet "Europe", selected columns and skips first 4 rows
df1["Continent"]= "Europe" #adds a new column with sheet name
df1=pd.DataFrame(df1) #creates df
df1.columns=["ID", "Product", "Quantity","Price","Continent"] #renames columns in df
df2 = pd.read_excel(r"C:\Users\XY\Sales2020.xlsm",
sheet_name = "North America",usecols=[1,2,4,6],header=4)
df2["Continent"]= "North America"
df2=pd.DataFrame(df2)
df2.columns=["ID", "Product", "Quantity","Price","Continent"]
df = pd.concat([df1, df2]) #concats the dfs
I would like to iterate through the sheets automatically and put the data from all sheets into a single dataframe. I tried something like the code below, but it doesn't do the job, because the loop only takes data from the last sheet in the list:
import pandas as pd
sheets=["Europe","North America"]
for i in sheets:
    dataset = pd.read_excel(r"C:\Users\XY\Sales2020.xlsm",
                            sheet_name = i,usecols=[1,2,4,6],header=4) #read Excel
    dataset["Continent"]= i #adds a new column with sheet name
    dataset = pd.DataFrame(dataset) #creates df
    dataset.columns=["ID", "Product", "Quantity","Price","Continent"] #renames columns in df
df= dataset.append(dataset) #this should append data from sheets into a single df
Do you have any ideas on how I could solve this?
Thanks a lot
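There is no need to wrap dataset in pd.DataFrame, since pd.read_excel already returns a DataFrame. Collect the DataFrame from each sheet in a list and concatenate them once after the loop: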
import pandas as pd
sheets=["Europe","North America"]
df_list=[]
for i in sheets:
    dataset = pd.read_excel(r"C:\Users\XY\Sales2020.xlsm",
                            sheet_name = i,usecols=[1,2,4,6],header=4) #read Excel
    dataset["Continent"]= i #adds a new column with sheet name
    df_list.append(dataset)
df=pd.concat(df_list)
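As a possible alternative (a sketch, not part of the answer above): pd.read_excel also accepts a list of sheet names and then returns a dict mapping each sheet name to its DataFrame, so the workbook only has to be opened once. The helper name combine_sheets and the COLUMNS constant are made up for illustration. The column names are taken from the question, and the sketch assumes each sheet yields exactly four columns.

```python
import pandas as pd

# Column names from the question; assumed to match the four selected columns.
COLUMNS = ["ID", "Product", "Quantity", "Price"]

def combine_sheets(frames):
    """Stack a {sheet_name: DataFrame} mapping into a single DataFrame.

    `frames` has the shape pd.read_excel returns when sheet_name is a list.
    """
    parts = []
    for name, frame in frames.items():
        part = frame.copy()          # leave the caller's frames untouched
        part.columns = COLUMNS       # rename the four selected columns
        part["Continent"] = name     # tag every row with its sheet name
        parts.append(part)
    # ignore_index=True gives the combined frame a fresh 0..n-1 index
    return pd.concat(parts, ignore_index=True)

# Usage with the workbook from the question (path and sheet names as given there):
# frames = pd.read_excel(r"C:\Users\XY\Sales2020.xlsm",
#                        sheet_name=["Europe", "North America"],
#                        usecols=[1, 2, 4, 6], header=4)
# df = combine_sheets(frames)
```

Passing ignore_index=True is a choice made here, not something the original code does. Without it, the row labels of each sheet are kept and will repeat in the combined frame.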