
What special handling is needed to execute a GET in TIdHTTP with Chinese characters as parameters?


I'm trying to GET the following URL in Delphi using TIdHTTP, but it fails:

https://api.sketchengine.eu/bonito/run.cgi/wsketch?lemma=上海&username=agofpos&api_key=ce39ab3f07544e759b068338ac1974e2&corpname=preloaded/zhtenten17_simplified_stf2&format=json

Using POSTMAN is OK, though.

Here is my code:

url := 'https://api.sketchengine.eu/bonito/run.cgi/wsketch?lemma=上海&username=agofpos&api_key=ce39ab3f07544e759b068338ac1974e2&corpname=preloaded/zhtenten17_simplified_stf2&format=json';
memoResult.Text := idHttp.Get(url);

Do we need special handling when dealing with non-English characters during GET? If so, how?

I'm using Delphi 10.2, if that helps.


Solution

  • URLs do not allow non-ASCII characters (IRIs do, but those are not widely in use yet). In a URL, you must URL-encode any non-ASCII characters (and other reserved characters), such as with Indy's TIdURI class, e.g.:

    uses
      ..., IdURI;
    
    url := 'https://api.sketchengine.eu/bonito/run.cgi/wsketch?lemma=' + TIdURI.ParamsEncode('上海') + '&username=agofpos&api_key=ce39ab3f07544e759b068338ac1974e2&corpname=preloaded/zhtenten17_simplified_stf2&format=json';
    memoResult.Text := idHttp.Get(url);
    

    This will encode the Chinese characters to UTF-8 and then encode each of those bytes in %HH format (a percent sign followed by the byte's two-digit hexadecimal value), e.g.:

    https://api.sketchengine.eu/bonito/run.cgi/wsketch?lemma=%E4%B8%8A%E6%B5%B7&username=agofpos&api_key=ce39ab3f07544e759b068338ac1974e2&corpname=preloaded/zhtenten17_simplified_stf2&format=json

    Notice how the 上海 became %E4%B8%8A%E6%B5%B7.
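
    To make the two steps concrete (UTF-8 bytes first, then %HH for each byte), here is a minimal sketch of the idea. PercentEncodeUtf8 is a hypothetical helper written for illustration, not Indy's implementation; it escapes everything except the RFC 3986 unreserved characters, so its output can differ from TIdURI.ParamsEncode for some ASCII punctuation. In real code, prefer TIdURI.

    ```delphi
    uses
      System.SysUtils;

    // Hypothetical helper (not part of Indy): percent-encodes a string
    // by converting it to UTF-8 and escaping every byte that is not an
    // RFC 3986 "unreserved" character.
    function PercentEncodeUtf8(const S: string): string;
    var
      Bytes: TBytes;
      B: Byte;
    begin
      Result := '';
      // Step 1: convert the UTF-16 Delphi string to UTF-8 bytes
      Bytes := TEncoding.UTF8.GetBytes(S);
      for B in Bytes do
        if B in [Ord('A')..Ord('Z'), Ord('a')..Ord('z'),
                 Ord('0')..Ord('9'),
                 Ord('-'), Ord('.'), Ord('_'), Ord('~')] then
          // Unreserved ASCII characters are copied through unchanged
          Result := Result + Char(B)
        else
          // Step 2: every other byte becomes %HH (two uppercase hex digits)
          Result := Result + '%' + IntToHex(B, 2);
    end;
    ```

    Under these assumptions, PercentEncodeUtf8('上海') yields %E4%B8%8A%E6%B5%B7: three UTF-8 bytes per character, each written as one %HH triplet.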

    Web browsers, POSTMAN, and similar tools handle this encoding internally for you. You can confirm this by looking at the raw HTTP request that is actually being transmitted.
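
    One possible way to look at the raw request from the Delphi side is to attach one of Indy's logging intercepts to the TIdHTTP instance. The sketch below is a rough illustration only: the procedure name LogRawRequest and the file name http_trace.log are placeholders, and it assumes the TIdHTTP instance already has a working SSL IOHandler assigned for https URLs.

    ```delphi
    uses
      IdHTTP, IdLogFile;

    // Hypothetical helper: performs a GET while writing the data sent
    // and received on the connection to a log file.
    procedure LogRawRequest(AHttp: TIdHTTP; const AUrl: string);
    var
      Log: TIdLogFile;
    begin
      Log := TIdLogFile.Create(nil);
      try
        Log.Filename := 'http_trace.log'; // placeholder path
        AHttp.Intercept := Log;           // hook the logger into the connection
        Log.Active := True;
        try
          AHttp.Get(AUrl);
        finally
          // Detach the logger before freeing it
          Log.Active := False;
          AHttp.Intercept := nil;
        end;
      finally
        Log.Free;
      end;
    end;
    ```

    After calling LogRawRequest(idHttp, url), the request line in the log file should show whether the lemma parameter went out as %E4%B8%8A%E6%B5%B7 or as unencoded characters.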