Add a month column to each excel file and then merge all files into a .csv

im new using python and for a work purpose im here to ask your help.

I have on a same folder 12 excel files month by month that contains columns like: Product_Name, Quantity and Total_Value

So, what i would like to do but I have no idea how to do are:

Add a month column on each of those files that contain the same date from the file name
Merge those excel files into a unique file

For example:

January-21.xls:

Product_Name (type:string)	Quantity (type:float)	Total_Value (type:float)	Month (type:Date)
Product A	10	250	"File Name" (January-21)
Product B	20	500	"File Name" (January-21)
Product C	15	400	"File Name" (January-21)

February-21.xls:

Product_Name (type:string)	Quantity (type:float)	Total_Value (type:float)	Month (type:Date)
Product A	40	800	"File Name" (February-21)
Product B	25	700	"File Name" (February-21)
Product C	30	500	"File Name" (February-21)

After merge:

Product_Name (type:string)	Quantity (type:float)	Total_Value (type:float)	Month (type:Date)
Product A	10	250	"File Name" (January-21)
Product B	20	500	"File Name" (January-21)
Product C	15	400	"File Name" (January-21)
Product A	40	800	"File Name" (February-21)
Product B	25	700	"File Name" (February-21)
Product C	30	500	"File Name" (February-21)

Is it possible? Sorry for my bad English, I'm not a native speaker.

I really appreciate your help!

Edit.1

This is how i merge, create a csv file and convert to a dataframe using pandas:


import pandas as pd
import os

path = "/content/drive/MyDrive/Colab_Notebooks/sq_datas"
files = [file for file in os.listdir(path) if not file.startswith('.')] # Ignore hidden files

all_months_data = pd.DataFrame()

for file in files:
    current_data = pd.read_excel(path+"/"+file)
    all_months_data = pd.concat([all_months_data, current_data])
    
all_months_data.to_csv("/content/drive/MyDrive/Colab_Notebooks/sq_datas/all_months.csv", index=False)

So, the main problem to me is create a loop for to add the month column before merge all those files in one.

Solution

At a basic level, you first need to read your Excel files, such as with pandas.read_excel:

import pandas as pd

jan21_df = pd.read_excel('January-21.xls')
feb21_df = pd.read_excel('February-21.xls')

You wrote type:Date for the Month column. To add a date column to each dataframe:

jan21_df['Month'] = pd.to_datetime('2021-01-01')
feb21_df['Month'] = pd.to_datetime('2021-02-01')

But if you wanted the file name as string:

jan21_df['Month'] = "File Name (January-21)"
feb21_df['Month'] = "File Name (February-21)"

Then to combine the two dataframes:

combined = pd.concat([jan21_df, feb21_df])

This is a proof of concept. There are ways to automate this further based on the requirements.

EDIT: based on the edit in the OP, minor addition to the loop:

for file in files:
    current_data = pd.read_excel(path+"/"+file)
    current_data['Month'] = file
    all_months_data = pd.concat([all_months_data, current_data])