I'm trying to upload files from my Datalab instance, within the notebook itself, to my Google Cloud Storage bucket using the Python API, but I'm unable to figure it out. The code example provided by Google in its documentation doesn't seem to work in Datalab. I'm currently using the gsutil command, but I'd like to understand how to do this using the Python API.
File directory (I want to upload the Python files located in the checkpoints folder):
!ls -R
.:
checkpoints README.md tpot_model.ipynb
./checkpoints:
pipeline_2020.02.29_00-22-17.py pipeline_2020.02.29_06-33-25.py
pipeline_2020.02.29_00-58-04.py pipeline_2020.02.29_07-13-35.py
pipeline_2020.02.29_02-00-52.py pipeline_2020.02.29_08-45-23.py
pipeline_2020.02.29_02-31-57.py pipeline_2020.02.29_09-16-41.py
pipeline_2020.02.29_03-02-51.py pipeline_2020.02.29_11-13-00.py
pipeline_2020.02.29_05-01-17.py
Current Code:
import google.datalab.storage as storage
from pathlib import Path

bucket = storage.Bucket('machine_learning_data_bucket')
for file in Path('').rglob('*.py'):
    # API CODE GOES HERE
Current Working Solution:
!gsutil cp checkpoints/*.py gs://machine_learning_data_bucket
This is the code that worked for me:
from google.cloud import storage
from pathlib import Path

storage_client = storage.Client()
bucket = storage_client.bucket('bucket')

for file in Path('/home/jupyter/folder').rglob('*.py'):
    # upload each .py file as an object named after the file
    blob = bucket.blob(file.name)
    blob.upload_from_filename(str(file))
    print("File {} uploaded to {}.".format(file.name, bucket.name))
Output:
File file1.py uploaded to bucket.
File file2.py uploaded to bucket.
File file3.py uploaded to bucket.
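Note that file.name is just the final path component, so every file lands at the bucket root. If you want to keep the folder structure (e.g. a checkpoints/ prefix) in the object names, one option is to name each blob after the path relative to the base folder. A minimal sketch, assuming the same bucket and folder layout as above:

from google.cloud import storage
from pathlib import Path

storage_client = storage.Client()
bucket = storage_client.bucket('bucket')

base = Path('/home/jupyter/folder')
for file in base.rglob('*.py'):
    # keep the relative path (e.g. "checkpoints/pipeline_....py") as the object name
    blob = bucket.blob(str(file.relative_to(base)))
    blob.upload_from_filename(str(file))
    print("File {} uploaded to {}.".format(file.relative_to(base), bucket.name))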
Or you can use the google.datalab.storage API:
import google.datalab.storage as storage
from pathlib import Path

bucket = storage.Bucket('bucket')

for file in Path('/home/jupyter/folder').rglob('*.py'):
    # write the file contents to an object named after the file
    blob = bucket.object(file.name)
    blob.write_stream(file.read_text(), 'text/plain')
    print("File {} uploaded to {}.".format(file.name, bucket.name))
Output:
File file1.py uploaded to bucket.
File file2.py uploaded to bucket.
File file3.py uploaded to bucket.
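To confirm the uploads, you can list the objects in the bucket afterwards. A quick check, assuming the google-cloud-storage client from the first example and the same bucket name:

from google.cloud import storage

storage_client = storage.Client()
# print the name of every object currently stored in the bucket
for blob in storage_client.list_blobs('bucket'):
    print(blob.name)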