community. I hope you can offer some guidance as I am new to python programming
I am trying to read a folder that contains 15 excel files and each excel file has 30 worksheets. I am trying to read each excel worksheet separately because I need to extract the features from the 30 sheets. I need to read the excel sheets using a pandas data frame, however, when I try to do it appears empty. I am not sure how to proceed from here, I have tried to place just df into the df_frame but I get an error.
Get the current working directory print("The current working directory is:")
print(os.getcwd())
path = r'C:\Users' excel_files = glob.glob(os.path.join(path, "*.xlsx"))
for files in excel_files:
# read the csv file
df = pd.read_excel(files,sheet_name=None)
df_frame = pd.DataFrame(df,columns= ['Time','Sensor'])
# print the location and filename
print (df_frame)
display(df_frame)
#print('Location:', files)
#print('File Name:', files.split("\\")[-1])
# print the content
#print('Content:')
#display(df)
I have tried to work around the message error, however, I don't know how to make the df_frame = pd.DataFrame(df,columns= ['Time','Sensor']) to be applied to all the excel worksheet in all the excel files.
P.s I don't need to join the excel worksheets into 1 single sheet at the moment. I just want to be able to read separately to be able to proceed with the feature extraction and then join the extracted features together.
The line df = pd.read_excel(files,sheet_name=None)
creates a dictionary, with the dictionary keys being the string name of each sheet and the dictionary values being the data found in each sheet in a pandas data frame.
You can loop through like so:
df_dict = pd.read_excel(files,sheet_name=None)
for sheet_name, df in df_dict.items():
## Rest of code here