I want to run my code on all files in a directory. The code works fine on a single file, but when I try to iterate over multiple files I get:
FileNotFoundError: [Errno 2] No such file or directory: 'file.xlsx'
directory = r"C:/Users/name/Desktop/folder/2018"
arrivals_aggregated = pd.DataFrame()
print(os.listdir(directory))
for filename in os.listdir(directory):
    print('current file is ' + filename)
    x = pd.ExcelFile(filename)
    symbols = x_symbols(x)
    arv = x.parse(sheet_name='Arrivals', skiprows=5, usecols=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23])
    arrivals = x_arrivals(arv, x)
    arrivals_aggregated.append(arrivals)
I expect it to iterate over all the files in the directory, processing each one and aggregating the results into one big dataframe, arrivals_aggregated. Instead, it stops at x = pd.ExcelFile(filename) and reports that the file was not found, even though the file is there and its name is even printed when I include
print('current file is ' + filename)
It fails on the very first file in the folder, before any of the processing code runs.
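Whether this works depends on where you run the script. os.listdir returns bare file names, not full paths, so filename is looked up relative to the current working directory. If filename is not present in the directory where you ran your script, you will get a FileNotFoundError.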
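To make the cause concrete, here is a minimal sketch using only the standard library. The temporary folder and the file name example_report_2018.xlsx are made up for illustration; the point is only that the name returned by os.listdir has no directory component, so it resolves against the working directory rather than the folder you listed.

```python
import os
import tempfile

# Throwaway folder holding one empty file (hypothetical name, for illustration only).
with tempfile.TemporaryDirectory() as directory:
    open(os.path.join(directory, "example_report_2018.xlsx"), "w").close()

    # listdir returns names only, without the directory part.
    names = os.listdir(directory)
    bare_name = names[0]

    # The bare name is resolved against the current working directory.
    found_bare = os.path.exists(bare_name)
    # Joining it with the directory gives a path that actually points at the file.
    found_full = os.path.exists(os.path.join(directory, bare_name))

    print(names, found_bare, found_full)
```

Unless the script happens to be run from inside that folder, the bare-name lookup fails while the joined path succeeds.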
I would instead do:
x = pd.ExcelFile(os.path.sep.join([directory, filename]))
which will ensure you're passing the full file path to pd.ExcelFile.
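If it helps, the path handling can be pulled into a small helper. This is only a sketch: excel_paths is a name I made up, and the filtering (keeping only .xlsx files and skipping the "~$" lock files Excel creates while a workbook is open) is an assumption about what the folder contains. It uses os.path.join, the more conventional way to write the join above.

```python
import os
import tempfile


def excel_paths(directory):
    """Return sorted full paths of the .xlsx files directly inside directory.

    Hypothetical helper: the .xlsx filter and the "~$" lock-file skip are
    assumptions about the folder, not something the question states.
    """
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.lower().endswith(".xlsx") and not name.startswith("~$")
    )


# Small demonstration on a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("b.xlsx", "a.xlsx", "notes.txt", "~$a.xlsx"):
        open(os.path.join(demo_dir, name), "w").close()
    demo_paths = excel_paths(demo_dir)
    demo_names = [os.path.basename(p) for p in demo_paths]
    print(demo_names)
```

Each returned path can then be passed to pd.ExcelFile in your loop. Separately, note that DataFrame.append returns a new frame instead of modifying arrivals_aggregated in place (and was removed in pandas 2.0), so you would probably want to collect the per-file frames in a list and combine them once with pd.concat.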