Tags: python, urllib, urllib3

File upload using urllib3 raises UnicodeDecodeError


I'm trying to upload a file via a multipart form POST request using urllib3. I followed this example from the urllib3 docs:

>>> with open('example.txt') as fp:
...     file_data = fp.read()
>>> r = http.request(
...     'POST',
...     'http://httpbin.org/post',
...     fields={
...         'filefield': ('example.txt', file_data),
...     })
>>> json.loads(r.data.decode('utf-8'))['files']
{'filefield': '...'}

When I adapted the example code, I added some extra fields required by the API I'm uploading to:

import urllib3

http = urllib3.PoolManager()

with open('/Volumes/GoogleDrive/My Drive/Code/Fuse-Qu/qu/uploads/risk.pdf') as fp:
    file_data = fp.read()

r = http.request(
    'POST',
    'https://***.fuseuniversal.com/api/v4.2/contents/media?auth_token=***',
    fields={
        "name": "test api upload 11",
        "description": "this is a test of uploading a pdf via the api",
        "composite_attributes[type]": "File",
        "community_ids": "24827",
        "composite_attributes[file]": ('risk.pdf', file_data, 'document/pdf'),
    })

However, I end up getting this error:

Traceback (most recent call last):
  File "test-urllib.py", line 6, in <module>
    file_data = fp.read()
  File "/Users/dunc/.pyenv/versions/3.8.1/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 10: invalid continuation byte

Solution

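  • You need to open the file in binary mode because a PDF is not text. Without 'b' in the mode, Python 3 opens the file in text mode and tries to decode its contents with the platform's default encoding (UTF-8 here, as the traceback shows), which fails as soon as it hits the PDF's binary bytes. Here are the failing lines, updated: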

    with open('/Volumes/GoogleDrive/My Drive/Code/Fuse-Qu/qu/uploads/risk.pdf', 'rb') as fp:
        file_data = fp.read()
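
For anyone wondering why the error shows up at position 10: a PDF usually starts with a version header such as %PDF-1.4, followed by a comment line of high-bit bytes that marks the file as binary. The byte 0xe2 looks like the start of a multi-byte UTF-8 sequence, but the byte after it is not a valid continuation byte. The sketch below reproduces that with a small stand-in file and then builds the request fields from the question with the file read as bytes. The names PDF_HEAD, read_text, read_bytes, build_fields and upload are made up for illustration, and 'application/pdf' is my substitution for the question's 'document/pdf' (it is the registered media type for PDF, but I have not checked what this particular API expects).

```python
import os
import tempfile

# Hypothetical stand-in for the first bytes of a typical PDF: a version header,
# then a comment line of high-bit bytes that marks the file as binary.
PDF_HEAD = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"


def read_text(path):
    """Read in text mode, forcing UTF-8 so the result does not depend on the locale."""
    with open(path, encoding="utf-8") as fp:
        return fp.read()


def read_bytes(path):
    """Read in binary mode: nothing is decoded, so any file content is accepted."""
    with open(path, "rb") as fp:
        return fp.read()


def build_fields(path):
    """Build the multipart fields from the question, with the file passed as bytes."""
    return {
        "name": "test api upload 11",
        "description": "this is a test of uploading a pdf via the api",
        "composite_attributes[type]": "File",
        "community_ids": "24827",
        # (filename, data, MIME type); 'application/pdf' is an assumption here,
        # the question used 'document/pdf'.
        "composite_attributes[file]": (
            os.path.basename(path),
            read_bytes(path),
            "application/pdf",
        ),
    }


def upload(url, path):
    """POST the file as multipart form data; url is whatever endpoint you upload to."""
    import urllib3  # imported here so the helpers above run without urllib3 installed

    http = urllib3.PoolManager()
    return http.request("POST", url, fields=build_fields(path))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "risk.pdf")
        with open(sample, "wb") as fp:
            fp.write(PDF_HEAD)

        try:
            read_text(sample)
        except UnicodeDecodeError as exc:
            print("text mode failed:", exc)

        print("binary mode read", len(read_bytes(sample)), "bytes")
```

Running the script only touches a temporary file; upload() is never called, so no request is sent until you call it yourself with your real URL and path.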