I'm trying to GET the following URL in Delphi using TIdHTTP,
but it fails:
https://api.sketchengine.eu/bonito/run.cgi/wsketch?lemma=上海&username=agofpos&api_key=ce39ab3f07544e759b068338ac1974e2&corpname=preloaded/zhtenten17_simplified_stf2&format=json
Using POSTMAN is OK, though.
Here is my code:
url := 'https://api.sketchengine.eu/bonito/run.cgi/wsketch?lemma=上海&username=agofpos&api_key=ce39ab3f07544e759b068338ac1974e2&corpname=preloaded/zhtenten17_simplified_stf2&format=json';
memoResult.Text := idHttp.Get(url);
Do we need special handling when dealing with non-English characters during GET? If so, how?
I'm using Delphi 10.2, if that helps.
URLs do not allow non-ASCII characters (IRIs do, but those are not widely in use yet). In a URL, you must percent-encode any non-ASCII characters (and other reserved characters), such as with Indy's TIdURI class, e.g.:
uses
..., IdURI;
url := 'https://api.sketchengine.eu/bonito/run.cgi/wsketch?lemma=' + TIdURI.ParamsEncode('上海') + '&username=agofpos&api_key=ce39ab3f07544e759b068338ac1974e2&corpname=preloaded/zhtenten17_simplified_stf2&format=json';
memoResult.Text := idHttp.Get(url);
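This will encode the Chinese characters to UTF-8 and then encode those bytes in %HH format, e.g.:
https://api.sketchengine.eu/bonito/run.cgi/wsketch?lemma=%E4%B8%8A%E6%B5%B7&username=agofpos&api_key=ce39ab3f07544e759b068338ac1974e2&corpname=preloaded/zhtenten17_simplified_stf2&format=json
Notice how 上海 became %E4%B8%8A%E6%B5%B7.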
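To make the two steps concrete (string to UTF-8 bytes, then bytes to %HH), here is a minimal sketch that does the encoding by hand using only the RTL. PercentEncodeUtf8 is a hypothetical helper written for illustration, not part of Indy, and the set of characters it leaves unescaped is an assumption that may differ from what TIdURI.ParamsEncode treats as safe. In real code, prefer TIdURI.

```delphi
program PercentEncodeDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper, for illustration only: converts S to UTF-8 and
// writes every byte outside the RFC 3986 "unreserved" set as %HH.
function PercentEncodeUtf8(const S: string): string;
var
  B: Byte;
begin
  Result := '';
  for B in TEncoding.UTF8.GetBytes(S) do
    if AnsiChar(B) in ['A'..'Z', 'a'..'z', '0'..'9', '-', '.', '_', '~'] then
      Result := Result + Char(B)              // safe ASCII byte, keep as-is
    else
      Result := Result + '%' + IntToHex(B, 2); // everything else becomes %HH
end;

begin
  // #$4E0A#$6D77 is 上海, written as code points so the result
  // does not depend on how the source file is saved.
  Writeln(PercentEncodeUtf8(#$4E0A#$6D77)); // prints %E4%B8%8A%E6%B5%B7
end.
```

Each of the two characters takes three bytes in UTF-8, which is why the result has six %HH groups.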
Web browsers, POSTMAN, etc. handle this internally for you. You can confirm this by looking at the raw HTTP request that is actually being transmitted.