python pandas jupyter-notebook amazon-sagemaker python-s3fs

NotImplementedError: Text mode not supported, use mode='wb' and manage bytes in s3fs


I know there is a similar question, but it is more general and not specific to this package. I am saving a pandas DataFrame from a SageMaker Jupyter notebook to a CSV file in S3 as follows:

df.to_csv('s3://bucket/key/file.csv', index=False)

However I am getting the following error:

NotImplementedError: Text mode not supported, use mode='wb' and manage bytes

Roughly, the code reads a CSV from S3, does some preprocessing on it, and then saves the result back to S3. I can read the CSV from S3 successfully with:

df = pd.read_csv('s3://bucket/key/file.csv')

The object that I am trying to save to S3 is indeed a pandas.core.frame.DataFrame

In the notebook, running !pip show <package> tells me that I have pandas 0.24.2 and s3fs 0.1.5.

What could be the problem?


Solution

  • Can you please try:

    df.to_csv("s3://bucket/key/file.csv", index=False, mode='wb')
    

    It should fix your error. The default mode is 'w', which writes to the target as text rather than bytes, whereas s3fs (as the error message says) only supports binary mode here. Hence you have to specify the mode as 'wb' (write bytes) when writing the DataFrame as CSV to S3.
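
If passing mode='wb' to to_csv does not work in your environment, one alternative is to do the "manage bytes" part yourself, as the error message suggests. The following is only a minimal sketch, not pandas' or s3fs' own mechanism: the helper names dataframe_to_csv_bytes and write_csv_binary are made up for illustration, and the S3 part is left as a comment because it needs s3fs and AWS credentials.

```python
import io

import pandas as pd


def dataframe_to_csv_bytes(df: pd.DataFrame, encoding: str = "utf-8") -> bytes:
    """Serialize a DataFrame to CSV text and encode it to bytes (hypothetical helper)."""
    # Called without a path, to_csv returns the CSV content as a str.
    return df.to_csv(index=False).encode(encoding)


def write_csv_binary(df: pd.DataFrame, binary_file) -> None:
    """Write the DataFrame as CSV to any file object opened in binary mode."""
    binary_file.write(dataframe_to_csv_bytes(df))


# Local demonstration using an in-memory binary buffer instead of S3.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
buffer = io.BytesIO()
write_csv_binary(df, buffer)

# Against S3, the same helper could be used roughly like this (not run here):
#
#   import s3fs
#   fs = s3fs.S3FileSystem()
#   with fs.open("bucket/key/file.csv", "wb") as f:
#       write_csv_binary(df, f)
```

Because the helper only ever hands bytes to the file object, it never asks s3fs to open anything in text mode.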