python excel dataframe concatenation

How to stack data from Excel files using Python


I have a folder with a growing number of Excel files containing daily data. I want to build a DataFrame that grows as files are added: each new file's data must stack under the existing data to extend the time series, and the result should then be saved to a new Excel output file. I have this code so far. Some parts work and I do get an output file, but the data is stacked in random order rather than in the order the files appear in the folder.

    # loop over the Excel files in the data folder
    all_data = pd.DataFrame()
    list = []  # create empty list
    for f in glob.glob("/Users/xxxx/xxxx/xxxx/Data/*.xlsx"):
        df = pd.read_excel(f, sheet_name='data')
        list.append(df)

    all_data = pd.concat(list)

    # save the data
    writer = pd.ExcelWriter('output.xlsx')
    all_data.to_excel(writer, 'sheet1')
    writer.save()

Thank you


Solution

  • import glob
    import pandas as pd

    # read the 'data' sheet of every workbook in the folder
    frames = []
    for f in glob.glob("/Users/xxxx/xxxx/xxxx/Data/*.xlsx"):
        df = pd.read_excel(f, sheet_name='data')
        frames.append(df)

    # stack the frames and save the result
    all_data = pd.concat(frames)
    all_data.to_excel("/Users/xxxx/xxxx/xxxx/Data/all_data.xlsx")
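On the ordering complaint specifically: `glob` does not promise any particular order for the paths it returns, so the stacking order can differ from what the folder view shows. One possible way around that is to sort the paths before reading them. The sketch below is only an illustration, not the original answerer's code: the helper names `list_excel_files` and `stack_excel_files` and the `read` parameter are hypothetical, and it assumes the file names sort chronologically (for example `2021-01-05.xlsx`). If they do not, a different sort key would be needed.

```python
from pathlib import Path

import pandas as pd


def list_excel_files(folder, pattern="*.xlsx"):
    """Return the matching files sorted by file name (hypothetical helper)."""
    # glob makes no ordering promise, so sort explicitly.
    # Assumption: file names sort chronologically, e.g. 2021-01-05.xlsx.
    return sorted(Path(folder).glob(pattern), key=lambda p: p.name)


def stack_excel_files(folder, sheet_name="data", read=pd.read_excel):
    """Read one sheet from every file and stack the frames in file-name order.

    `read` is a hypothetical hook so another reader can be swapped in;
    by default it is pandas.read_excel.
    """
    frames = [read(path, sheet_name=sheet_name) for path in list_excel_files(folder)]
    # ignore_index=True gives the stacked frame one continuous row index
    # instead of repeating each file's own 0..n index.
    return pd.concat(frames, ignore_index=True)
```

With these helpers, something like `stack_excel_files("/Users/xxxx/xxxx/xxxx/Data").to_excel("output.xlsx")` should produce the stacked file. Writing the output outside the input folder, as the question's `output.xlsx` does, keeps it from being matched by `*.xlsx` on the next run.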