I have a Python script that is run by cron. The script imports the pandas module and uses read_csv to load a CSV into a data frame, then later saves it to another CSV. 'apath' is the absolute path to the directory containing the file:
statedata_raw=pd.read_csv(apath+'statedata.csv')
statedata_raw.to_csv(apath+'state_data.csv',index=False)
The permissions on the CSV file are set correctly: -rwxr-xr-x
When I run the script from the command line, everything works fine. When I run it via cron, I get the following error:
Traceback (most recent call last):
  File "/users/maderman/wdtest.py", line 21, in <module>
    statedata_raw=pd.read_csv(apath+'statedata.csv')
  File "/opt/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 676, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/opt/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 448, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "/opt/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 880, in __init__
    self._make_engine(self.engine)
  File "/opt/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1114, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/opt/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1891, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 374, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 678, in pandas._libs.parsers.TextReader._setup_parser_source
OSError: Initializing from file failed
I verified that pandas itself loads and that to_csv works by replacing the read_csv call. When I replaced read_csv with the following code, which builds a data frame manually, everything worked fine, both from the command line and from cron:
cat=['a','a','a','a','a','b','b','b','b','b']
val=[1,2,3,4,5,6,7,8,9,10]
columns=['cat','val']
data=[cat,val]
data_dict={key:value for key,value in zip(columns,data)}
statedata_raw=pd.DataFrame(data=data_dict)
I found another post that suggested passing the argument engine='python' to read_csv, but that made no difference.
So, as far as I can tell, the issue is specifically related to the read_csv call.
Any suggestions would be appreciated.
The framing of this question was wrong; it boiled down to a permissions issue. A better question was posted and answered here: stackoverflow.com/questions/62353610
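
Since the cause turned out to be permissions, a small diagnostic run from inside the cron job might have surfaced it sooner. The following is only a sketch, not the fix from the linked answer: describe_access is a hypothetical helper name, and it assumes a Unix-like system (which cron implies). It reports who the process is running as, whether the file is visible and readable, and the exact error that a plain open() raises, which is usually more specific than "Initializing from file failed".

```python
import os
import stat
import tempfile


def describe_access(path):
    """Return a dict describing whether the current process can read `path`.

    Hypothetical helper for illustration; not part of pandas.
    """
    info = {
        "path": path,
        "uid": os.getuid(),          # user the process runs as (cron may differ from your shell)
        "cwd": os.getcwd(),          # cron usually starts in a different working directory
        "exists": os.path.exists(path),
        "readable": os.access(path, os.R_OK),
        "mode": None,
        "open_error": None,
    }
    if info["exists"]:
        # Human-readable permission string, e.g. '-rwxr-xr-x'
        info["mode"] = stat.filemode(os.stat(path).st_mode)
    try:
        # Opening the file directly surfaces the underlying OSError, if any
        with open(path, "rb"):
            pass
    except OSError as exc:
        info["open_error"] = repr(exc)
    return info


if __name__ == "__main__":
    # Demo on a throwaway file; in the real script, pass apath + 'statedata.csv'
    with tempfile.NamedTemporaryFile(suffix=".csv") as tmp:
        for key, value in describe_access(tmp.name).items():
            print(f"{key}: {value}")
```

Calling this just before read_csv and comparing the output from the command line with the output captured from cron should show whether the two runs differ in user, working directory, or access to the file. Note that a file's own mode bits can look fine while a parent directory or an OS-level restriction still blocks the read, which is why the helper also attempts the open itself.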