google-drive-api, google-api-python-client

Correct usage of resumable upload with the python client library


I am a bit confused about resumable uploads to Google Drive, and I am hoping someone could be kind enough to clarify things a bit.

At this page: https://developers.google.com/api-client-library/python/guide/media_upload
it states:

For large media files, you can use resumable media uploads to send files, which allows files to be uploaded in smaller chunks.

It also describes how to do so using next_chunk(), checking for errors, and retrying with exponential backoff.
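For reference, the next_chunk() pattern with error checking and exponential retrying can be sketched roughly as below. This is only a minimal sketch, not the guide's exact code: the function name `upload_with_retries`, the `max_retries` and `sleep` parameters, and the set of retriable status codes are my own assumptions. `request` stands for whatever `service.files().insert(..., media_body=media)` returns when `media` was created with `resumable=True`. The error is inspected through `err.resp.status`, which is where the client library's HttpError keeps the HTTP status.

```python
import random
import time

# Assumption: server-side statuses treated as transient and worth retrying.
RETRIABLE_STATUS = {500, 502, 503, 504}


def upload_with_retries(request, max_retries=5, sleep=time.sleep):
    """Drive a resumable upload to completion, retrying transient failures.

    `request` is any object whose next_chunk() returns (status, response),
    where response stays None until the last chunk has been accepted.
    """
    response = None
    retries = 0
    while response is None:
        try:
            status, response = request.next_chunk()
            retries = 0  # a chunk went through, so reset the backoff
        except Exception as err:
            # HttpError carries the HTTP status in err.resp.status;
            # plain network failures show up as IOError/OSError.
            code = getattr(getattr(err, "resp", None), "status", None)
            transient = code in RETRIABLE_STATUS or isinstance(err, IOError)
            if not transient or retries >= max_retries:
                raise
            retries += 1
            # Exponential backoff with a little random jitter.
            sleep(2 ** retries + random.random())
    return response
```

Because the loop calls next_chunk() again on the same request object after a transient error, it continues the existing upload session instead of building a new request from scratch.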

All other references to uploading, whether inserting or updating a file, use "resumable=True" but do not call the "next_chunk" function, as on this page: https://developers.google.com/drive/v2/reference/files/insert#examples

Does this mean that "resumable" is handled by the library?
If not, in case of errors, are those the same as in the previous example (with next_chunk)?
If my app is supposed to catch the errors, then the only way to go is to restart the upload from the beginning, since nothing is returned that says how many bytes were uploaded successfully. Is this the right way?

Also at this page: https://developers.google.com/drive/manage-uploads
it states:

With resumable uploads, you can break a file into chunks and send a series of requests to upload each chunk in sequence. This is not the preferred approach since there are performance costs associated with the additional requests, and it is generally not needed.

Which one of those two statements is correct?

Thanks in advance for any input.
Andreas


Solution

  • Andreas,

    I believe service.files().insert(...).execute() with resumable=True does something similar to the manual next_chunk loop in the example you posted. I'm not sure exactly how it handles it, because I couldn't find a way to read the source (I'm just starting with Python), but if I interrupt the upload of a large file that uses the .insert().execute() method with resumable=True, one of the lines of the traceback is this:

    File "/usr/lib/python2.5/site-packages/apiclient/http.py", line 656, in execute
      _, body = self.next_chunk(http=http)
    

    However, I couldn't find a way to get a progress indicator using this method, so I preferred to use the manual request.next_chunk(), instead.
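A rough sketch of what I mean by the manual loop with a progress indicator is below. The function name `upload_with_progress` and the `report` callback are made up for illustration; `request` is assumed to be the object returned by `service.files().insert(..., media_body=media)` with `media = MediaFileUpload(path, resumable=True)`. The only library call it relies on is `status.progress()`, which returns the uploaded fraction as a float.

```python
def upload_with_progress(request, report=print):
    """Upload chunk by chunk, reporting progress after each chunk.

    `request` is any object whose next_chunk() returns (status, response);
    status is None once the upload has finished, and response is None
    until then.
    """
    response = None
    while response is None:
        status, response = request.next_chunk()
        if status is not None:
            # progress() is the fraction uploaded so far, between 0 and 1.
            report("Uploaded %d%%" % int(status.progress() * 100))
    report("Upload complete")
    return response
```

Passing a different `report` callable (for example one that updates a GUI progress bar) is the only change needed to show progress somewhere other than the console.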

    As for the performance cost of resumable uploads: extra information is sent, but I don't think it will slow the process down much. You can use bigger chunk sizes (a few MiB) so that the bytes added by the extra requests are negligible. The SDK documentation does argue in favor of resumable upload in some circumstances:

    "To upload data files more reliably, you can use the resumable upload protocol. This protocol allows you to resume an upload operation after a communication failure has interrupted the flow of data. It is especially useful if you are transferring large files and the likelihood of a network interruption or some other transmission failure is high, for example, when uploading from a mobile client app. It can also reduce your bandwidth usage in the event of network failures because you don't have to restart large file uploads from the beginning."
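To put a number on the chunk size point, here is a small back-of-the-envelope sketch. The helper `chunk_request_count` is hypothetical, not part of the client library; it only counts how many chunk requests a file of a given size needs, ignoring the request that starts the upload session. In the real client the chunk size is set through the `chunksize` argument of `MediaFileUpload`, and the Drive upload documentation asks for chunks that are multiples of 256 KiB.

```python
KIB = 1024
MIB = 1024 * KIB


def chunk_request_count(file_size, chunksize):
    """Number of chunk requests needed to send file_size bytes."""
    if chunksize <= 0:
        raise ValueError("chunksize must be positive")
    # Ceiling division; even an empty file takes one request.
    return max(1, -(-file_size // chunksize))


# Compare a few chunk sizes for a 100 MiB file.
for chunksize in (256 * KIB, 1 * MIB, 8 * MIB):
    count = chunk_request_count(100 * MIB, chunksize)
    print(chunksize // KIB, "KiB chunks ->", count, "requests")
```

For a 100 MiB file this works out to 400 requests with 256 KiB chunks but only 13 with 8 MiB chunks, which is why the per-request overhead becomes negligible with larger chunks.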