The following URL works as expected and returns "null".
https://zga2tn1wgd.execute-api.us-east-1.amazonaws.com/mycall?url=https://mr.wikipedia.org/s/4jp4
But the same page, requested with its Unicode URL instead of the ASCII short URL, throws an error:
"errorMessage": "'ascii' codec can't encode characters in position 10-20: ordinal not in range(128)", "errorType": "UnicodeEncodeError"
How do I encode the Unicode characters when passing the string to API Gateway?
I am using the following bookmarklet to generate the URL mentioned above:
javascript:(function(){location.href='https://z3nt6lcj40.execute-api.us-east-1.amazonaws.com/mycall?url='+encodeURIComponent(location.href);})();
The following line in your Lambda function unquotes the URL, converting it
url1 = urllib.parse.unquote(url)
from
'https://zga2tn1wgd.execute-api.us-east-1.amazonaws.com/mycall?url=https://mr.wikipedia.org/wiki/%E0%A4%95%E0%A4%BF%E0%A4%B6%E0%A5%8B%E0%A4%B0%E0%A4%BE%E0%A4%B5%E0%A4%B8%E0%A5%8D%E0%A4%A5%E0%A4%BE'
to
'https://zga2tn1wgd.execute-api.us-east-1.amazonaws.com/mycall?url=https://mr.wikipedia.org/wiki/किशोरावस्था'
The non-US-ASCII parts of the result above have to be percent-encoded again before performing the request. Here they are in the query component.
It is recommended to split a URI into its components before encoding it, so that the encoding does not change its semantics.
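As a rough illustration of the splitting step, the sketch below reports which components of a URL contain non-ASCII characters once it has been unquoted. The helper name `non_ascii_components` is made up for this example and is not part of your Lambda code; it assumes Python 3.7+ for `str.isascii`.

```python
from urllib.parse import unquote, urlsplit


def non_ascii_components(url):
    """Return the names of the URL components that hold non-ASCII text.

    Hypothetical helper for illustration only.
    """
    # Unquote first, as the Lambda does, then split into
    # scheme, netloc, path, query and fragment.
    parts = urlsplit(unquote(url))
    return [name for name, value in parts._asdict().items()
            if not value.isascii()]


# The wiki URL on its own carries the Unicode text in its path ...
print(non_ascii_components("https://mr.wikipedia.org/wiki/%E0%A4%95"))
# ... while the full API Gateway URL carries it in the query.
print(non_ascii_components(
    "https://zga2tn1wgd.execute-api.us-east-1.amazonaws.com/mycall"
    "?url=https://mr.wikipedia.org/wiki/%E0%A4%95"))
```

Which component is affected depends on which URL the function actually requests, so it may be worth checking this before deciding what to re-encode.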
Here are some more things to do before making the request to the URL.
import urllib.parse

url1 = urllib.parse.unquote(url)
urlparts = urllib.parse.urlparse(url1)
# parse_qs returns a dict mapping each key to a list of values
querypart = urllib.parse.parse_qs(urlparts.query)
# doseq=True encodes each list item as its own key=value pair
querypart_enc = urllib.parse.urlencode(querypart, doseq=True)
# Rebuild URL with escaped query part
url1 = urllib.parse.urlunparse((
    urlparts.scheme, urlparts.netloc,
    urlparts.path, urlparts.params,
    querypart_enc, urlparts.fragment
))
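If the URL you end up requesting is the Wikipedia page itself, the Unicode text sits in the path (`/wiki/...`) rather than in the query, so the path would need quoting as well. The following is a minimal sketch that covers both components; `to_ascii_url` is a hypothetical helper name, and it assumes the host name and fragment are already ASCII (an internationalized domain would need separate IDNA handling).

```python
from urllib.parse import (parse_qsl, quote, unquote, urlencode,
                          urlsplit, urlunsplit)


def to_ascii_url(url):
    """Percent-encode the path and query of a URL (illustrative sketch)."""
    parts = urlsplit(url)
    # Unquote before quoting so existing %XX escapes are not doubled.
    path = quote(unquote(parts.path), safe="/")
    # Decode the query into (key, value) pairs and re-encode each pair.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    query = urlencode(pairs)
    # Scheme, host and fragment are passed through unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, query,
                       parts.fragment))


url1 = to_ascii_url("https://mr.wikipedia.org/wiki/किशोरावस्था")
url1.encode("ascii")  # no longer raises UnicodeEncodeError
```

Because the path is unquoted before it is quoted, passing an already encoded URL through the function leaves it unchanged, so it should be safe to call regardless of whether the bookmarklet's encoding survived.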