Django/httplib: transmitting request.raw_post_data with httplib


"AAaarg" ! Please HeLp !!!

Here is what I am trying to do ...

I have a Django site, site1, which needs to access the API of another service, site2. To do that, site1 has to authenticate with its own login credentials.

Therefore I have written a small Django app that duplicates the URLs of site2 but, under the hood, uses httplib2 to forward each request almost unchanged (it only adds the authentication). It works in most cases, and it used to work in all cases. I don't really know what broke it, possibly the update from Python 2.6 to 2.7.

To transmit the POST/PUT data as is, I read it with:

post_data = request.raw_post_data

And then send it with httplib2:

response, content = c.request(
    url,
    method,
    post_data,
    headers=headers,
)

The problem occurs when posting multipart data that contains binary content, such as an image. When building the request string, httplib (on top of which httplib2 is built) tries to concatenate my post_data with the request line and headers it has generated. It seems that request.raw_post_data is a str while the generated part is unicode. Python therefore tries to decode my post_data (which contains binary data) in order to join the two, and fails on the non-ASCII bytes.

cf. httplib line 807:

if isinstance(message_body, str):
    msg += message_body

Here is an extract of message_body (obtained from request.raw_post_data):

'-----------------------------697585321193462802080194682\r\nContent-Disposition: form-data; name="_method"\r\n\r\nPUT\r\n-----------------------------697585321193462802080194682\r\nContent-Disposition: form-data; name="jpegPhoto"; filename="crap.jpg"\r\nContent-Type: image/jpeg\r\n\r\n\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x01\x00H\x00H\x00\x00\xff\xfe\x00\x13Created with GIMP\xff\xdb\x00C\x00\x05\x03\x04\x04\x04\x03\x05\x04\x04\x04

Here is the content of msg:

u'POST /user/spiq/?username=spiq HTTP/1.1\r\nContent-Length: 40307\r\naccept-language: en-us,en;q=0.5\r\naccept-encoding: gzip, deflate\r\nhost: localhost:8000\r\naccept: application/json\r\nuser-agent: Mozilla/5.0 (X11; Linux x86_64; rv:5.0) Gecko/20100101 Firefox/5.0\r\naccept-charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nconnection: keep-alive\r\nreferer: \r\ncookie: csrftoken=d9a3e014e5e366ee435b27ae7fc122af; sessionid=d5492a8d640e346b8ca56fa87e5cc439\r\ncontent-type: multipart/form-data\r\n\r\n'
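The failing step can be reproduced in isolation. When Python 2 evaluates unicode + str, it first promotes the str to unicode with the default codec, which is ASCII unless it has been changed. The sketch below performs that promotion explicitly on the first bytes of the JPEG payload shown above, so it behaves the same on Python 2 and 3. The helper name implicit_promotion is made up for illustration and is not part of httplib.

```python
# First bytes of the JPEG payload taken from the message_body extract.
body = b"\xff\xd8\xff\xe0\x00\x10JFIF"


def implicit_promotion(byte_string):
    """Mimic what Python 2 does for `unicode + str`: decode as ASCII.

    Hypothetical helper, only here to make the hidden step visible.
    """
    return byte_string.decode("ascii")


try:
    implicit_promotion(body)
    failed = False
except UnicodeDecodeError:
    # 0xff is outside the ASCII range, so the promotion cannot succeed.
    failed = True

print(failed)
```

A pure-ASCII body such as b"PUT" goes through the same promotion without error, which would explain why the proxy only breaks on binary uploads.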

So basically it is doomed ...

Any idea how I should proceed? Can I turn my post_data into unicode without decoding it?


Solution

  • According to http://bugs.python.org/issue11898#msg138059, it looks like you might be okay if you make sure that the url argument is coerced into a str before it is passed to httplib2.
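A minimal sketch of that coercion, under some assumptions: the helper names to_native_str and native_request_args are hypothetical, the URL is assumed to be plain ASCII (already percent-encoded), and coercing the method and header values as well goes beyond what the linked message says. It is included on the guess that any unicode piece of the request head would turn msg into unicode in the same way.

```python
import sys

PY2 = sys.version_info[0] == 2
# `unicode` only exists on Python 2; the branch is not evaluated on Python 3.
text_type = unicode if PY2 else str  # noqa: F821


def to_native_str(value, encoding="ascii"):
    """Return value as the interpreter's native str type.

    On Python 2 this turns unicode into a byte str, which is the case
    that matters for httplib. Other types are returned unchanged.
    """
    if PY2 and isinstance(value, text_type):
        return value.encode(encoding)
    if not PY2 and isinstance(value, bytes):
        return value.decode(encoding)
    return value


def native_request_args(url, method, headers):
    """Coerce everything that ends up in the request head to native str."""
    clean_headers = dict(
        (to_native_str(k), to_native_str(v)) for k, v in headers.items()
    )
    return to_native_str(url), to_native_str(method), clean_headers


# Intended use in the proxy view (c is the httplib2.Http instance and
# post_data is request.raw_post_data, as in the question):
#
#   url, method, headers = native_request_args(url, method, headers)
#   response, content = c.request(url, method, post_data, headers=headers)
```

With the request line and headers held as byte strings, msg should stay a str and the binary post_data can be appended without any implicit decoding.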