python-3.x, google-cloud-storage, google-cloud-datalab

Google Datalab read from cloud storage


I know this question has been asked many times, but none of the answers fit my case. I would like to retrieve a CSV file stored in Cloud Storage from Datalab. So that I can reuse the code in a normal application, I DO NOT want to use the datalab.storage library; I want to use the official Cloud Storage library, without any magics.

Is it possible? So far I have:

from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket(BUCKET_NAME)
blob = storage.Blob(gs_path, bucket)
# here I should put something equivalent to 
# data = data_obj.read_stream() if using datalab.storage
# %gcs read --object $uri --variable data if using magic

How can I do this using only the plain storage library? Thanks.


Solution

  • Yes, this is possible. Assuming you want it saved to a file, you can use blob.download_to_filename():

    from google.cloud import storage

    def download_blob(bucket_name, source_blob_name, destination_file_name):
        """Downloads a blob from the bucket to a local file."""
        storage_client = storage.Client()
        bucket = storage_client.get_bucket(bucket_name)
        # Build a local reference to the object; no request is made yet
        blob = bucket.blob(source_blob_name)

        # Fetch the object's contents and write them to the given path
        blob.download_to_filename(destination_file_name)

        print('Blob {} downloaded to {}.'.format(
            source_blob_name,
            destination_file_name))
    

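    Other options are available as well: download_as_string() returns the contents in memory as bytes, and download_to_file() writes them to an already open file object. In newer releases of the library, download_as_string() is deprecated in favor of download_as_bytes() and download_as_text().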
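
Since the question asks for the data in memory rather than on disk, here is a minimal sketch of that route using download_as_string(). The helper names parse_csv_bytes and read_csv_blob, and the optional client parameter, are made up for illustration and are not part of the library. The sketch also assumes the CSV file is UTF-8 encoded.

```python
import csv
import io


def parse_csv_bytes(raw, encoding="utf-8"):
    """Decode raw CSV bytes and return the rows as lists of strings."""
    # Assumption: the file is UTF-8 unless another encoding is passed in
    text = raw.decode(encoding)
    return list(csv.reader(io.StringIO(text)))


def read_csv_blob(bucket_name, blob_name, client=None):
    """Read a CSV object from Cloud Storage into memory (hypothetical helper)."""
    if client is None:
        # Imported here so parse_csv_bytes stays usable without the library
        from google.cloud import storage
        client = storage.Client()
    bucket = client.get_bucket(bucket_name)
    blob = bucket.blob(blob_name)
    # Returns the object's contents as bytes; nothing is written to disk
    raw = blob.download_as_string()
    return parse_csv_bytes(raw)
```

Called as rows = read_csv_blob(BUCKET_NAME, gs_path), this should return the file's rows without touching the local filesystem. Passing the client in as a parameter is only a design convenience, so the same function can be reused (or tested with a stand-in client) outside Datalab.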
