php http url utf-8 url-encoding

Mimic browser URL encoding for Chinese characters?


If you go here: http://hdjob.bjx.com.cn/AdvanceSearch.shtml

And find in the source HTML:

<dd><a href="/SearchResult.aspx?workprovince=安徽" target="_blank">安徽</a></dd>

If you hover over the link in Chrome or Firefox, or simply open it, the URL looks like this:

http://hdjob.bjx.com.cn/SearchResult.aspx?workprovince=%B0%B2%BB%D5

So the Chinese characters 安徽 are URL encoded as %B0%B2%BB%D5 automatically by the browsers.

My question is: how can I mimic this in PHP?

I tried these:

echo urlencode("安徽"), PHP_EOL;
echo rawurlencode("安徽");

Which output:

%E5%AE%89%E5%BE%BD
%E5%AE%89%E5%BE%BD

However, if you go to:

http://hdjob.bjx.com.cn/SearchResult.aspx?workprovince=%E5%AE%89%E5%BE%BD

You get the wrong page, and the workprovince parameter isn't decoded correctly at all.

It seems both Chrome and Firefox encode the Chinese characters differently from urlencode() and rawurlencode().

How can I mimic their behavior in PHP?


Solution

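  •  The %B0%B2%BB%D5 sequence is the GB2312 byte representation of 安徽, so the page appears to be served in GB2312, and browsers percent-encode a link's query string in the page's character encoding rather than in UTF-8. Convert the string to GB2312 first, then URL-encode it:

     echo urlencode(mb_convert_encoding('安徽', 'gb2312', 'utf-8')); // %B0%B2%BB%D5
     echo urlencode('安徽'); // %E5%AE%89%E5%BE%BD (raw UTF-8 bytes, which this site does not expect)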
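
The conversion step can be wrapped in a small helper so that the full link is built in one place. This is only a sketch: urlencode_as() is a hypothetical name, not a built-in function, and it assumes the PHP source file is saved as UTF-8, the mbstring extension is enabled, and the target site really expects GB2312.

```php
<?php
// Hypothetical helper: percent-encode a UTF-8 string using another charset's bytes.
// Assumes the input is valid UTF-8 and that mbstring is available.
function urlencode_as(string $utf8, string $charset = 'GB2312'): string
{
    // Convert the bytes from UTF-8 to the target charset, then percent-encode them.
    return urlencode(mb_convert_encoding($utf8, $charset, 'UTF-8'));
}

$base = 'http://hdjob.bjx.com.cn/SearchResult.aspx';
$url  = $base . '?workprovince=' . urlencode_as('安徽');
echo $url, PHP_EOL;
// http://hdjob.bjx.com.cn/SearchResult.aspx?workprovince=%B0%B2%BB%D5

// Going the other way: decode the percent-escapes, then convert back to UTF-8.
echo mb_convert_encoding(urldecode('%B0%B2%BB%D5'), 'UTF-8', 'GB2312'), PHP_EOL; // 安徽
```

For a different site, check the charset declared in its Content-Type header or meta tag and pass that as the second argument instead of assuming GB2312.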