google-cloud-platform google-cloud-storage

Writing a file to GCS using a resumable upload URL


I'm generating resumable upload URLs using service key impersonation. However, I'm not able to use the URL to write the file.

I'm using a PUT request, but the file isn't being written into the bucket. This is the response I get:

curl -X PUT -T "temp.json" "https://storage.googleapis.com/upload/storage/v1/b/temp-bucket/o?uploadType=resumable&upload_id=ADPycdt5syFNnE7mhpI7-zDvsSBsAvggbN9OoRO0L3sxxxxxx"
{
  "kind": "storage#object",
  "id": "temp-bucket/temp.json/1653298451710685",
  "selfLink": "https://www.googleapis.com/storage/v1/b/temp-bucket/o/temp.json",
  "mediaLink": "https://storage.googleapis.com/download/storage/v1/b/temp-bucket/o/temp.json?generation=1653298451710685&alt=media",
  "name": "temp.json",
  "bucket": "temp-bucket",
  "generation": "1653298451710685",
  "metageneration": "1",
  "contentType": "application/json",
  "storageClass": "STANDARD",
  "size": "23",
  "md5Hash": "GlEmumKUMqtQEY9mx0+JRQ==",
  "crc32c": "4OdQJg==",
  "etag": "CN2FsNeo9fcCEAE=",
  "timeCreated": "2022-05-23T09:34:11.781Z",
  "updated": "2022-05-23T09:34:11.781Z",
  "timeStorageClassUpdated": "2022-05-23T09:34:11.781Z"
}
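
Editorial note: a `storage#object` resource in the response body usually means the object was created, and the `size` of 23 here agrees with the 23-byte file sent later in the question. Below is a minimal sketch for checking a local file against such a response, assuming `md5Hash` is the base64-encoded MD5 digest of the object's bytes, as Cloud Storage documents it. The helper names `gcs_md5` and `matches_gcs_metadata` are made up for illustration.

```python
import base64
import hashlib


def gcs_md5(data: bytes) -> str:
    """Return the MD5 digest of data, base64-encoded like the md5Hash field."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def matches_gcs_metadata(data: bytes, metadata: dict) -> bool:
    """Compare local bytes with the size and md5Hash of an object resource."""
    # The JSON API returns size as a string, so convert before comparing.
    same_size = int(metadata["size"]) == len(data)
    same_hash = metadata["md5Hash"] == gcs_md5(data)
    return same_size and same_hash
```

Calling `matches_gcs_metadata(open("temp.json", "rb").read(), response_json)` with the parsed response above would indicate whether the bucket holds the same bytes as the local file.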

I'm trying to do it in Python as well, but can't find a way to upload the data when it's not a string.

import urllib3
import requests
from google.resumable_media.requests import SimpleUpload
import google.auth
import google.auth.transport.requests as tr_requests
import functions_framework


@functions_framework.http
def main(request):
    config = request.get_json(silent=True)
    UPLOAD_URL = config['UPLOAD_URL']
    upload = SimpleUpload(UPLOAD_URL)
    # Bug: open() returns a file object (which also has no .encode());
    # transmit() below needs bytes, e.g. the result of .read().
    data = open('temp.json', 'rb').encode('utf-8')
    target_scopes = "https://www.googleapis.com/auth/devstorage.read_write"
    credentials, _ = google.auth.default(scopes=(target_scopes,))
    transport = tr_requests.AuthorizedSession(credentials)
    content_type = 'application/json'
    response = upload.transmit(transport, data, content_type)
    return response

Error:

File "/layers/google.python.pip/pip/lib/python3.9/site-packages/google/resumable_media/_upload.py", line 202, in _prepare_request
    raise TypeError("`data` must be bytes, received", type(data))
TypeError: ('`data` must be bytes, received', <class '_io.BufferedReader'>)
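
The traceback says `transmit` wants `bytes`, while `open(...)` returns a file object. Below is a minimal sketch of normalising common payload types to bytes before sending. `to_bytes` is a hypothetical helper, not part of `google-resumable-media`, and treating "anything else" as JSON is an assumption.

```python
import io
import json


def to_bytes(payload) -> bytes:
    """Convert bytes, str, a file object, or JSON-serialisable data to bytes."""
    if isinstance(payload, (bytes, bytearray)):
        return bytes(payload)
    if isinstance(payload, str):
        return payload.encode("utf-8")
    if hasattr(payload, "read"):
        content = payload.read()
        # Text-mode files return str, so encode those as well.
        return content.encode("utf-8") if isinstance(content, str) else content
    # Assumption: anything else (dict, list, ...) should be sent as JSON.
    return json.dumps(payload).encode("utf-8")


# Example with an in-memory binary stream standing in for open(path, "rb").
sample = to_bytes(io.BytesIO(b'{"a": 1}'))
```

With the file from the question, `to_bytes(open('temp.json', 'rb'))` would return the file contents as bytes. Note also that `SimpleUpload` is meant for single-request media uploads, so it may not be the right tool for an already-created resumable session URL; the session expects a PUT of the bytes.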

This is confusing. A normal PUT call is supposed to work, right?

$ curl -i -X PUT --data-binary @temp.json \
> -H "Content-Length: 23" \
> "https://storage.googleapis.com/upload/storage/v1/b/temp-bucket/o?uploadType=resumable&upload_id=adpdodldldldldldldid"
HTTP/2 400
content-type: text/html; charset=UTF-8
referrer-policy: no-referrer
content-length: 1555
date: Mon, 23 May 2022 13:54:52 GMT
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
curl: (92) HTTP/2 stream 0 was not closed cleanly: PROTOCOL_ERROR (err 1)
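
Editorial note: the 400 response alone doesn't reveal what state the session is in. One way to inspect a session is the status check described in the Cloud Storage resumable upload documentation, sketched here on the assumption that it applies: an empty PUT with `Content-Range: bytes */*`, answered with 308 while the upload is incomplete and 200 or 201 once it is complete. The function names are illustrative.

```python
import urllib.error
import urllib.request


def build_status_request(session_uri: str) -> urllib.request.Request:
    """Build an empty PUT that asks how many bytes the session has received."""
    return urllib.request.Request(
        session_uri,
        data=b"",
        method="PUT",
        headers={"Content-Length": "0", "Content-Range": "bytes */*"},
    )


def query_status(session_uri: str):
    """Return (HTTP status, Range header or None) for a resumable session."""
    try:
        with urllib.request.urlopen(build_status_request(session_uri)) as resp:
            return resp.status, resp.headers.get("Range")
    except urllib.error.HTTPError as err:
        # urllib typically raises for 308 (incomplete) and for 4xx/5xx replies.
        return err.code, err.headers.get("Range")
```

A 200 or 201 from `query_status` would suggest the session has already been finalised, for example by an earlier successful PUT.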

Solution

  • A resumable upload takes more than one request (see the Cloud Storage documentation on resumable uploads):

    • one request to create the resumable session; the session URI is returned in the Location header of the response:
    curl -i -X POST --data-binary @METADATA_LOCATION \
        -H "Authorization: Bearer OAUTH2_TOKEN" \
        -H "Content-Type: application/json" \
        -H "Content-Length: INITIAL_REQUEST_LENGTH" \
        "https://storage.googleapis.com/upload/storage/v1/b/BUCKET_NAME/o?uploadType=resumable&name=OBJECT_NAME"
    
    • one or more requests to upload the file to the session URI:
    curl -i -X PUT --data-binary @OBJECT_LOCATION \
        -H "Content-Length: OBJECT_SIZE" \
        "SESSION_URI"
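
The two curl steps above can be sketched in Python with only the standard library. This is an illustration under assumptions, not a definitive client: the function names are made up, the metadata body carries only `contentType`, the whole object goes up in a single PUT, and there is no retry or error handling.

```python
import json
import urllib.parse
import urllib.request

UPLOAD_ROOT = "https://storage.googleapis.com/upload/storage/v1"


def build_initiate_request(bucket, object_name, token,
                           content_type="application/octet-stream"):
    """Build the POST that opens a resumable session for object_name."""
    query = urllib.parse.urlencode(
        {"uploadType": "resumable", "name": object_name}
    )
    metadata = json.dumps({"contentType": content_type}).encode("utf-8")
    return urllib.request.Request(
        f"{UPLOAD_ROOT}/b/{bucket}/o?{query}",
        data=metadata,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=UTF-8",
            "Content-Length": str(len(metadata)),
        },
    )


def build_upload_request(session_uri: str, data: bytes):
    """Build the PUT that sends the whole object to the session URI."""
    return urllib.request.Request(
        session_uri,
        data=data,
        method="PUT",
        headers={"Content-Length": str(len(data))},
    )


def resumable_upload(bucket, object_name, token, data,
                     content_type="application/octet-stream"):
    """Open a session, upload data in one PUT, and return the object metadata."""
    initiate = build_initiate_request(bucket, object_name, token, content_type)
    with urllib.request.urlopen(initiate) as resp:
        # The session URI is returned in the Location header.
        session_uri = resp.headers["Location"]
    with urllib.request.urlopen(build_upload_request(session_uri, data)) as resp:
        return json.load(resp)
```

If a session URI has already been generated elsewhere, as in the question, only the second step is needed: `urllib.request.urlopen(build_upload_request(session_uri, data))`.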