I am trying to walk a directory tree and, for each CSV file encountered on the walk, open the file and read columns 0 and 15 into a data-frame (after which I'll process it and move on to the next file). I can walk the directory tree using the following:
rootdir = r'C:/Users/stacey/Documents/Alco/auditopt/'
for dirName, subdirList, fileList in os.walk(rootdir):
    print('Found directory: %s' % dirName)
    for fname in fileList:
        print('\t%s' % fname)
        df = pd.read_csv(fname, header=1, usecols=[0,15],parse_dates=[0], dayfirst=True,index_col=[0], names=['date', 'total_pnl_per_pos'])
        print(df)
but I'm getting the error message:
FileNotFoundError: File b'auditopt.os-pnl.BBG_XASX_ARB_S-BBG_XTKS_7240_S.csv' does not exist.
I am trying to read files which do exist. They are in MS Excel .csv format, so I don't know if that is an issue. If it is, could someone let me know how to read an MS Excel .csv into a data-frame, please?
The full stack trace is as follows:
Found directory: C:/Users/stacey/Documents/Alco/auditopt/
Found directory: C:/Users/stacey/Documents/Alco/auditopt/roll_597_oe_2017-03-10
tradeopt.os-pnl.BBG_XASX_ARB_S-BBG_XTKS_7240_S.csv
Traceback (most recent call last):
  File "<ipython-input-24-3753e367432d>", line 1, in <module>
    runfile('C:/Users/stacey/Documents/scripts/Pair_Results_Code_1.0.py', wdir='C:/Users/stacey/Documents/scripts')
  File "C:\Anaconda\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
    execfile(filename, namespace)
  File "C:\Anaconda\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/stacey/Documents/scripts/Pair_Results_Code_1.0.py", line 49, in <module>
    main()
  File "C:/Users/stacey/Documents/scripts/Pair_Results_Code_1.0.py", line 36, in main
    df = pd.read_csv(fname, header=1, usecols=[0,15],parse_dates=[0], dayfirst=True,index_col=[0], names=['date', 'total_pnl_per_pos'])
  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 646, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 389, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 730, in __init__
    self._make_engine(self.engine)
  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 923, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 1390, in __init__
    self._reader = _parser.TextReader(src, **kwds)
  File "pandas\parser.pyx", line 373, in pandas.parser.TextReader.__cinit__ (pandas\parser.c:4184)
  File "pandas\parser.pyx", line 667, in pandas.parser.TextReader._setup_parser_source (pandas\parser.c:8449)
FileNotFoundError: File b'tradeopt.os-pnl.BBG_XASX_ARB_S-BBG_XTKS_7240_S.csv' does not exist
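When reading in the file, you need to provide the full path. os.walk yields only bare file names in fileList, not full paths, so pd.read_csv looks for fname relative to the current working directory (here the scripts folder) and cannot find it. The error is about the path, not the file format. You'll need to build the full path yourself; os.path.join makes this easy.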
import os
# Inside the inner loop: prepend the directory that os.walk is currently visiting.
full_path = os.path.join(dirName, fname)
df = pd.read_csv(full_path, ...)
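
Putting the pieces together, here is a minimal sketch of the path-building part on its own. The helper name iter_csv_paths and the .csv extension filter are my own additions for illustration, not something from the question; the only technique it relies on is joining each directory from os.walk with the bare file name.

```python
import os


def iter_csv_paths(rootdir):
    """Yield the full path of every .csv file found under rootdir.

    Hypothetical helper: the name and the extension filter are assumptions.
    """
    for dir_name, _subdirs, file_names in os.walk(rootdir):
        for fname in file_names:
            # Skip anything that is not a CSV (case-insensitive check).
            if fname.lower().endswith(".csv"):
                # os.walk yields bare file names, so join with the directory.
                yield os.path.join(dir_name, fname)
```

Each path it yields can then be passed to pd.read_csv with the same keyword arguments as in the question, for example by looping over iter_csv_paths(rootdir).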