Tags: python-2.7, utf-8, character-encoding, confluence-rest-api

Python POST request throws UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0'


I have Python code that reads a page's data from Confluence using the REST API and then uses that data to create a new page in Confluence. While posting the data, the code throws the error below:

 UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 48108: ordinal not in range(128)

However, the same request works fine in Postman; my guess is that Postman handles the encoding. Many answers I found suggest using str.encode('utf-8') or str.encode('ascii', 'ignore'), but neither worked for me. I also tried:

   import sys
   reload(sys)
   sys.setdefaultencoding('utf-8')    

but in vain. If I use any of these to encode the data, I get a 500 Internal Server Error when firing the POST request.
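One plausible source of the 500, assuming the template contains raw line breaks: a JSON string may not contain a literal newline, so a body assembled by string formatting can be rejected even after the encoding problem is gone. The sketch below uses a made-up payload and a hypothetical helper, is_valid_json, to show the difference between a raw and an escaped newline.

```python
import json

# Hypothetical payload with a literal newline inside a JSON string value.
raw = '{"value": "line one\nline two"}'

# Same payload with the newline written as the two characters \ and n.
escaped = raw.replace('\n', '\\n')


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


print(is_valid_json(raw))
print(is_valid_json(escaped))
```

Whether Confluence answers invalid JSON with a 500 specifically is an assumption here; the point is only that the unescaped body is not valid JSON.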

Here's my Python code: get_template_data gets the Confluence page details, and create_weekly_status_page uses them in the POST request to create a new page.

#5. Get the weekly status template page data
def get_template_data():
    url = 'https://confluence.abc.com/rest/api/content/11111111?expand=body.export_view'
    headers['content-type'] = "application/json"    
    r = requests.get(url, headers=headers)
    data = r.json()
    template_data=data['body']['export_view']['value']
    return template_data

#6. Using weekly status template page(5) & string(2), create (post) a new page in Confluence.
def create_weekly_status_page(title,template_data):
    post_data_prefix = """{"type":"page","title":"%s", "ancestors":[{"id":222222}], "space":{"key":"TST"},"body":{"storage":{"value":"%s","representation":"storage"}}}"""

    #template_data1 = u' '.join(template_data).encode('utf-8')
    template_data = template_data.replace('"', '\\"')
    post_body_str = post_data_prefix % (title, template_data)

    url = 'https://confluence.abc.com/rest/api/content/'
    headers['content-type'] = "application/json"

    r = requests.post(url, headers=headers, data=post_body_str)
    print r.status_code
    r.raise_for_status()

def main():
    #5. Get the weekly status template page data
    template_data = get_template_data()
    #print "template_data:", template_data

    #6. Using weekly status template page(5) & string(2), create (post) a new page in Confluence.
    # The weekly status string (2) will serve as title of the page
    weekly_status_str = "Weekly Report (01/01/17 - 01/05/17)"
    create_weekly_status_page(weekly_status_str,template_data)

The culprit line is r = requests.post(url, headers=headers, data=post_body_str), so the data in post_body_str most likely has something to do with this.

Below is the stack trace:

Traceback (most recent call last):
  File "new_status.py", line 216, in <module>
    main()
  File "new_status.py", line 193, in main
    create_weekly_status_page(weekly_status_str,template_data)
  File "new_status.py", line 79, in create_weekly_status_page
    url = 'https://confluence.abc.com/rest/api/content/'
  File "/Library/Python/2.7/site-packages/requests/api.py", line 110, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/Library/Python/2.7/site-packages/requests/api.py", line 56, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Library/Python/2.7/site-packages/requests/sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "/Library/Python/2.7/site-packages/requests/sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "/Library/Python/2.7/site-packages/requests/adapters.py", line 423, in send
    timeout=timeout
  File "/Library/Python/2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 594, in urlopen
    chunked=chunked)
  File "/Library/Python/2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 361, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1053, in request
    self._send_request(method, url, body, headers)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1093, in _send_request
    self.endheaders(body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 897, in _send_output
    self.send(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 869, in send
    self.sock.sendall(data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 721, in sendall
    v = self.send(data[count:])
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 687, in send
    v = self._sslobj.write(data)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 52926: ordinal not in range(128)
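The traceback ends inside ssl.py's write call, which suggests the body reached the socket as a unicode object and Python 2 tried to convert it to bytes with the default ascii codec. That reading is an inference from the trace, not something confirmed against the Confluence data. A minimal reproduction with a made-up string containing the same u'\xa0' (non-breaking space) character:

```python
# Made-up sample text containing U+00A0, the character named in the error.
text = u'Weekly\xa0Report'

# Encoding with the ascii codec fails, just like the implicit conversion.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 can represent the character (as the two bytes C2 A0).
utf8_bytes = text.encode('utf-8')

# 'ignore' avoids the error by silently dropping the character.
dropped = text.encode('ascii', 'ignore')

print(ascii_ok)
print(repr(utf8_bytes))
print(repr(dropped))
```

Note that the 'ignore' variant loses the non-breaking space entirely, while the UTF-8 variant keeps it.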

Also, I'm using body.export_view when getting the template data because that page contains macros, and I want the rendered results of the macros to be copied rather than the macros themselves.

I'm quite new to Python and am writing this to automate a few things and learn Python along the way. I'd appreciate any help or pointers.

Python Version: 2.7.10


Solution

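  • I finally got this working with the help of these two lines:

    template_data = template_data.encode('ascii', 'ignore')
    template_data = template_data.replace('\n', '\\n')

    The first line fixes the UnicodeEncodeError by dropping every non-ASCII character (including the u'\xa0' non-breaking space). The second line fixes the 500 Internal Server Error by escaping newlines, since raw newlines are not allowed inside a JSON string.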
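An alternative that avoids dropping characters, offered as a sketch rather than the fix used above: let json.dumps build the body instead of % formatting. With its default settings it escapes quotes and newlines and writes non-ASCII characters as \uXXXX escapes, so the result is plain ASCII. build_page_body is a hypothetical helper, the sample HTML is made up, and the ancestor id and space key are the placeholder values from the question.

```python
import json


def build_page_body(title, template_data, ancestor_id=222222, space_key="TST"):
    """Serialize the page payload with json.dumps (hypothetical helper)."""
    payload = {
        "type": "page",
        "title": title,
        "ancestors": [{"id": ancestor_id}],
        "space": {"key": space_key},
        "body": {"storage": {"value": template_data,
                             "representation": "storage"}},
    }
    # ensure_ascii=True (the default) emits non-ASCII characters as \uXXXX
    # escapes, so the returned string contains only ASCII characters.
    return json.dumps(payload)


# Made-up template text with a non-breaking space, quotes and a newline.
sample = u'<p>Status:\u00a0"on track"</p>\n<p>Next week</p>'
body = build_page_body(u"Weekly Report (01/01/17 - 01/05/17)", sample)
print(body)
```

The returned string could then be passed as data=body to requests.post with the same content-type header. The requests signature shown in the traceback also accepts a json= argument, which serializes a dict itself.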