Using Google Datalab: write CSV to Cloud Storage


I'm trying to use Google Datalab, but I can't write a DataFrame to GCS (Google Cloud Storage) as a proper CSV file.

import pandas as pd
from pandas import DataFrame
from io import BytesIO
df = DataFrame({"a":[1,2],"b":1})
print(df)
>>    | a | b
>>  0 | 1 | 1
>>  1 | 2 | 1

On Stack Overflow, I found this command:

%storage write --object gs://my-bucket/data/test.csv --variable df

But when I use this command, reading the data back doesn't work properly, because the data is separated by spaces rather than commas, and it includes the index.

%storage read --object gs://my-bucket/data/test.csv --variable test_file

df2 = pd.read_csv(BytesIO(test_file))
print(df2)
>>    | a b
>>  0 | 0 1 1
>>  1 | 1 2 1

I want to write it as CSV without the index, like df.to_csv('test_file.csv', index=False).

How should I do this? Please advise.


Solution

  • Can you try the following?

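    import pandas as pd
    from io import BytesIO
    df = pd.DataFrame({"a":[1,2],"b":1})
    # Write the CSV to a local file without the index column
    df.to_csv('text.csv', index=False)
    # Copy the local file to the bucket
    !gsutil cp 'text.csv' 'gs://path-to-your-bucket/test.csv'
    # Read the object back into a variable and parse it
    %gcs read --object gs://path-to-your-bucket/test.csv --variable test_file
    df2 = pd.read_csv(BytesIO(test_file))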
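
    If you would rather skip the temporary local file, one possible variation is to build the CSV text in memory first. Calling to_csv() without a path returns the CSV as a string, and index=False drops the index column. The sketch below only shows the round trip locally; the csv_text name is made up for illustration, and the write magic mentioned in the comment is an assumption about how Datalab handles a string variable, not something verified here.

```python
from io import BytesIO

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": 1})

# With no path argument, to_csv() returns the CSV as a string.
# index=False leaves out the index column.
csv_text = df.to_csv(index=False)

# Assumption: in a Datalab cell, a plain string variable could then be
# written as-is, for example:
#   %gcs write --variable csv_text --object gs://path-to-your-bucket/test.csv

# Stand-in for the bytes that reading the object back would give you.
test_file = csv_text.encode("utf-8")

df2 = pd.read_csv(BytesIO(test_file))
print(df2)
```

    Writing the DataFrame variable directly appears to store its printed form, which would explain the spaces and the index in the file, so converting to CSV text first avoids both.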