htmlspecialchars() appears to be translating special characters such as āķūņūķī into their respective entity numbers:
&#257; &#311; &#363; &#326; &#363; &#311; &#299;
Others remain untranslated, such as:
žš
I would like htmlspecialchars()
(or some other function) to leave these alphabetic characters untranslated, so that it only translates &, ", ', < and > (as the php.net manual seems to indicate).
The reason I need this is that after a POST request, I run the user input through htmlspecialchars()
before placing it back into a new set of HTML inputs. Characters such as &, ", ', < and > need to be translated so that they do not cause display errors. But letters such as 'āķūņūķī' must remain unchanged, or the user will be very confused.
Set the third parameter to 'UTF-8':
echo htmlentities('āķūņūķī', ENT_QUOTES, 'UTF-8');
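The default encoding for htmlspecialchars() and htmlentities() was ISO-8859-1 before PHP 5.4. Since PHP 5.4 the default is UTF-8, and since PHP 5.6 it follows the default_charset setting, so passing the encoding explicitly is the safest option.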
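Since the question asks about htmlspecialchars() rather than htmlentities(), here is a minimal sketch of the same idea with that function. With an explicit UTF-8 charset it should escape only &, ", ', < and >, and leave letters such as āķūņūķī and žš alone. The sample string is made up for illustration.

```php
<?php
// Illustrative input: Latvian letters mixed with HTML-significant characters.
$input = 'āķūņūķī žš <b>"Tom" & \'Jerry\'</b>';

// ENT_QUOTES escapes both double and single quotes;
// 'UTF-8' tells PHP how the input bytes are encoded.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// The letters pass through unchanged; only & " ' < > become entities.
echo $escaped, PHP_EOL;
```

Note that htmlentities() with 'UTF-8' would still convert any character that has a named entity in its table (š becomes &scaron;, for example), so htmlspecialchars() is probably the closer fit when only the five markup characters should change.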
Test case:
var_dump(htmlentities('āķūņūķī'));
var_dump(htmlentities('āķūņūķī', ENT_QUOTES, 'UTF-8'));
Output:
string(84) "&Auml;�&Auml;&middot;&Aring;&laquo;&Aring;�&Aring;&laquo;&Auml;&middot;&Auml;&laquo;"
string(14) "āķūņūķī"
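For the use case in the question (echoing POSTed input back into form fields), a small wrapper keeps the flags and charset consistent across the page. This is only a sketch: the helper name escape_html() and the field name 'title' are hypothetical, and it assumes the page itself is served and submitted as UTF-8.

```php
<?php
// Hypothetical helper: escape a value for HTML text or a quoted attribute.
function escape_html(string $value): string
{
    // ENT_SUBSTITUTE replaces invalid UTF-8 sequences with U+FFFD
    // instead of making the function return an empty string.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// 'title' is an assumed field name; use an empty string when nothing usable was posted.
$title = (isset($_POST['title']) && is_string($_POST['title'])) ? $_POST['title'] : '';

// Re-populate the input with the escaped value.
echo '<input type="text" name="title" value="' . escape_html($title) . '">', PHP_EOL;
```

If the characters still arrive as numeric entities, it may be worth checking that the form page declares UTF-8 (for example via a Content-Type header or meta charset), since browsers can encode characters as entities when the page charset cannot represent them.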