Tags: python, for-loop, pandas, append, export-to-excel

Loop through Excel files, process each one, and save them to a new folder (Python, pandas)


I am having trouble with a for loop. I don't know what changed: this used to work, but now it fails with "No objects to concatenate".

I want to accomplish two things. First, I want to loop through all the Excel files in a folder and, for each file, remove the 2 rows of header data (which the code below already does).

Then I want to save each edited file under its original file name in a new folder, and also save a new file in which the data from all the edited files is appended together.

I thought I had the append correct, but for some reason it no longer works.

import os
import pandas as pd
import numpy as np

from pandas import Series, DataFrame

appended_data = []

path = 'C:\Test\TestRawFile'
for fn in os.listdir(path):
    if os.path.isfile(fn):
        # Import the excel file and call it xlsx_file
        xlsx_file = pd.ExcelFile(fn)
        # View the excel files sheet names
        xlsx_file.sheet_names
        # Load the xlsx files Data sheet as a dataframe
        df = xlsx_file.parse('Sheet1',header= None)
        df_NoHeader = df[2:]
        data = df_NoHeader
        appended_data.append(data)
appended_data = pd.concat(appended_data)

Here is the error I currently get.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-46-962ccf280c0b> in <module>()
     11         data = df_NoHeader
     12         appended_data.append(data)
---> 13 appended_data = pd.concat(appended_data)

C:\Anaconda2\lib\site-packages\pandas\tools\merge.pyc in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, copy)
    832                        keys=keys, levels=levels, names=names,
    833                        verify_integrity=verify_integrity,
--> 834                        copy=copy)
    835     return op.get_result()
    836 

C:\Anaconda2\lib\site-packages\pandas\tools\merge.pyc in __init__(self, objs, axis, join, join_axes, keys, levels, names, ignore_index, verify_integrity, copy)
    865 
    866         if len(objs) == 0:
--> 867             raise ValueError('No objects to concatenate')
    868 
    869         if keys is None:

ValueError: No objects to concatenate

Solution

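  • Most likely the script worked before because it ran from inside the folder of Excel files, and it fails now because the script (or its working directory) has moved. os.listdir(path) returns bare file names, so os.path.isfile(fn) looks for each name relative to the current working directory. When that is a different folder, the check is False for every file, nothing is appended, and pd.concat() receives an empty list. Build the full path by joining the folder path to the file name with os.path.join(), which also helps when saving to the new folder: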

    import os
    import pandas as pd

    dfList = []
    path = 'C:\\Test\\TestRawFile'
    newpath = 'C:\\Path\\To\\New\\Folder'  # this folder must already exist

    for fn in os.listdir(path):
        # Full path: os.listdir() returns bare file names
        file = os.path.join(path, fn)
        if os.path.isfile(file):
            # Open the Excel workbook
            xlsx_file = pd.ExcelFile(file)
            # Load the 'Sheet1' worksheet as a dataframe, with no header row
            df = xlsx_file.parse('Sheet1', header=None)
            # Drop the first 2 rows (the header lines)
            data = df[2:]
            # Save the individual dataframe under its original file name
            data.to_excel(os.path.join(newpath, fn))

            dfList.append(data)

    # Combine all dataframes and save the result
    appended_data = pd.concat(dfList)
    appended_data.to_excel(os.path.join(newpath, 'master_data.xlsx'))
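
To see why the original check silently skipped every file, here is a minimal sketch that reproduces the situation without pandas. The temporary folders and the file name report.xlsx are hypothetical stand-ins for your data folder; the file is empty because only its name matters for the os.path.isfile() test.

```python
import os
import tempfile

# Hypothetical stand-in for the folder of Excel files.
folder = tempfile.mkdtemp()
open(os.path.join(folder, "report.xlsx"), "w").close()

# Switch to a different, empty working directory, as happens once the
# script or notebook no longer runs from the data folder.
original_cwd = os.getcwd()
os.chdir(tempfile.mkdtemp())

names = os.listdir(folder)  # bare file names, without the folder part

# Checked relative to the current working directory: nothing is found.
found_relative = [fn for fn in names if os.path.isfile(fn)]

# Checked with the folder joined on: the file is found.
found_joined = [fn for fn in names if os.path.isfile(os.path.join(folder, fn))]

os.chdir(original_cwd)

print(names)           # ['report.xlsx']
print(found_relative)  # []
print(found_joined)    # ['report.xlsx']
```

An empty found_relative corresponds to the empty appended_data list in the question, which is what makes pd.concat() raise "No objects to concatenate".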