json python-3.x google-cloud-storage

How to read a .json file from a Google Cloud Storage bucket in Python


I'm trying to read a .json file stored in a Google Cloud Storage bucket as a dict() in Python code running on a VM instance.

I tried reading the JSON file as a blob:

from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket('bucket-id-here')
blob = bucket.get_blob('remote/path/to/file.json')
str_json = blob.download_as_string()

But I'm unable to decode str_json. Is my approach correct? If there is another approach, please let me know.

I need something like:

# Method to load json
dict = load_json(gcs_path='gs://bucket_name/filename.json')

Solution

  • The GCS file system library, gcsfs, can be used to read files from Google Cloud Storage.

    # Reading gcs files with gcsfs
    import gcsfs
    import json
    
    gcs_file_system = gcsfs.GCSFileSystem(project="gcp_project_name")
    gcs_json_path = "gs://bucket_name/path/to/file.json"
    with gcs_file_system.open(gcs_json_path) as f:
      json_dict = json.load(f)
    

    This approach also works for images stored in GCS, for example with skimage:

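    from skimage import io

    # Placeholder image path (hypothetical); reuses gcs_file_system from above
    gcs_img_path = "gs://bucket_name/path/to/image.png"
    with gcs_file_system.open(gcs_img_path) as f:
      img = io.imread(f)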
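
  • The blob approach from the question can probably be made to work as well, since the download returns bytes that json.loads can parse once decoded. Below is a minimal sketch of the load_json helper the question asks for, assuming the google-cloud-storage package is installed, default credentials are available on the VM, and the file is UTF-8 encoded. The helper names split_gcs_path and decode_json_bytes are hypothetical and only for illustration.

```python
import json


def split_gcs_path(gcs_path):
    """Split 'gs://bucket/path/to/file.json' into (bucket_name, blob_name)."""
    prefix = "gs://"
    if not gcs_path.startswith(prefix):
        raise ValueError(f"Not a gs:// path: {gcs_path!r}")
    bucket_name, _, blob_name = gcs_path[len(prefix):].partition("/")
    if not bucket_name or not blob_name:
        raise ValueError(f"Missing bucket or object name: {gcs_path!r}")
    return bucket_name, blob_name


def decode_json_bytes(raw):
    """Decode downloaded bytes (assumed UTF-8) and parse them as JSON."""
    return json.loads(raw.decode("utf-8"))


def load_json(gcs_path, client=None):
    """Download a JSON object from GCS and return it as a Python object."""
    # Imported here so the helpers above can be used without the library.
    from google.cloud import storage

    bucket_name, blob_name = split_gcs_path(gcs_path)
    client = client or storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    return decode_json_bytes(blob.download_as_bytes())
```

    With this in place, the call from the question would look like json_dict = load_json(gcs_path='gs://bucket_name/filename.json'). The result is assigned to json_dict rather than dict to avoid shadowing the built-in type.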