
PHP rawurlencode function encodes differently than the browser


When I enter the string DIN 2080 Fräseraufnahmen SK 30 into the browser, it is encoded as DIN%202080%20Fr%C3%A4seraufnahmen%20SK%2030, and the API reads it correctly. However, when a PHP script calling the same API encodes the string with rawurlencode(), the result is DIN%202080%20Fra%CC%88seraufnahmen%20SK%2030, and the API fails to match the string. How do I encode the string in PHP so that the result is the same as the browser's encoding?

Browser:

DIN%202080%20Fr%C3%A4seraufnahmen%20SK%2030

PHP rawurlencode():

DIN%202080%20Fra%CC%88seraufnahmen%20SK%2030

Addendum: After further checking, I found that the problem is caused by Composer's autoloader. However, Sitethief's answer fixes this.

Test program:

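<?php
// Composer's autoloader (see the addendum above).
require_once __DIR__ . '/vendor/autoload.php';
// Encode the query parameter exactly as received.
echo '<pre>' . rawurlencode($_GET['a']) . '</pre>';
echo '<br />';
// Normalize to NFC first (requires the intl extension), then encode.
echo '<pre>' . rawurlencode(Normalizer::normalize($_GET['a'], Normalizer::FORM_C)) . '</pre>';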

Script called as /test.php?a=ä

Output:

a%CC%88

%C3%A4


Solution

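  • PHP is encoding the string in decomposed form: %CC%88 is the UTF-8 encoding of U+0308 (combining diaeresis) following a plain a, whereas the browser sends the precomposed ä (U+00E4, encoded as %C3%A4). You can use the PHP Normalizer class to convert the string to Normalization Form C (NFC), which produces the precomposed form. The intl PHP extension must be installed for this to work.

    It should work like this: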

    $normalizedString = Normalizer::normalize($inputString, Normalizer::FORM_C);
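
    To make the effect visible end to end, here is a minimal sketch that normalizes and then encodes. The helper name encodeLikeBrowser() is hypothetical and not part of PHP or of the original answer. It assumes PHP 7 or later with the intl extension loaded, and it falls back to the unnormalized input when Normalizer::normalize() returns false (which can happen for invalid UTF-8).

```php
<?php
// Hypothetical helper (illustration only): normalize to NFC, then percent-encode.
function encodeLikeBrowser(string $input): string
{
    // Normalizer::normalize() returns false on failure, e.g. for invalid UTF-8.
    $normalized = Normalizer::normalize($input, Normalizer::FORM_C);
    if ($normalized === false) {
        // Assumption: encoding the raw input is preferable to encoding "false".
        $normalized = $input;
    }
    return rawurlencode($normalized);
}

// "a" followed by U+0308 COMBINING DIAERESIS (decomposed form).
$decomposed = "Fra\u{0308}seraufnahmen";
// Precomposed U+00E4 (composed form).
$composed = "Fr\u{00E4}seraufnahmen";

echo rawurlencode($decomposed), "\n";      // Fra%CC%88seraufnahmen
echo encodeLikeBrowser($decomposed), "\n"; // Fr%C3%A4seraufnahmen
echo encodeLikeBrowser($composed), "\n";   // Fr%C3%A4seraufnahmen
```

    Normalizing an already composed string leaves it unchanged, so the helper can be applied to every input regardless of which form it arrives in.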