Change iconv() replacement character


I am using iconv() to transliterate characters for an API request like this:

$text = iconv("UTF-8", "ASCII//TRANSLIT", $text);

This works great, but when iconv() encounters a character it cannot transliterate, it replaces it with a ?. Is there a straightforward way to change the substitution character to something else, say a space? I know about mb_substitute_character() for the mb functions, but that setting doesn't apply to iconv().

Example:

$text = '? € é î ü π ∑ ñ';
echo iconv("UTF-8", "ASCII//TRANSLIT", $text), PHP_EOL;

Output:

? EUR e i u ? ? n

Desired Output:

? EUR e i u     n

Solution

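  • AFAIK there's no translit function that lets you specify your own replacement character, but you can work around it with some simple escaping of your own: escape every ? already present in the input, transliterate, replace the ? characters that are still unescaped (those can only have come from iconv()), then undo the escaping.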

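    function my_translit($text, $replace='!', $in_charset='UTF-8', $out_charset='ASCII//TRANSLIT') {
        // escape existing backslashes and question marks: \ becomes \\ and ? becomes \?
        $res = str_replace(['\\', '?'], ['\\\\', '\\?'], $text);
        // transliterate; characters iconv() cannot map come back as a bare ?
        $res = iconv($in_charset, $out_charset, $res);
        // replace every ? that is not preceded by a backslash (i.e. one produced by iconv)
        $res = preg_replace('/(?<!\\\\)\\?/', $replace, $res);
        // unescape: \\ back to \ and \? back to ?
        return str_replace(['\\\\', '\\?'], ['\\', '?'], $res);
    }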
    
    $text = '? € é î ü π ∑ ñ \\? \\\\? \\\\\\?';
    var_dump(
        $text,
        my_translit($text)
    );
    

    Result:

    string(36) "? € é î ü π ∑ ñ \? \\? \\\?"
    string(29) "? EUR ! ! ! p ! ! \? \\? \\\?"
    

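    I'm not certain why the transliteration output is different on my system (//TRANSLIT results depend on the iconv implementation and on the current locale, so that is a likely cause), but the character replacement works.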
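
The escaping idea above can also be done in a single pass, which is worth considering because of one edge case: if the input contains a backslash directly followed by a character iconv() cannot map, the lookbehind sees the backslash and leaves that ? alone, and the later unescape step then treats it as an original ?. The following is only a sketch under a few assumptions: the name translit_with_placeholder is made up for illustration, a space is used as the default placeholder to match the question, and PHP 7.4+ is assumed for the arrow function.

```php
<?php
// Hypothetical helper (not part of PHP or iconv): same escaping idea as
// my_translit(), but replacing and unescaping happen in one pass.
function translit_with_placeholder(
    string $text,
    string $replace = ' ',
    string $in_charset = 'UTF-8',
    string $out_charset = 'ASCII//TRANSLIT'
): string {
    // Escape in one pass: \ becomes \\ and ? becomes \?
    $escaped = strtr($text, ['\\' => '\\\\', '?' => '\\?']);

    $converted = iconv($in_charset, $out_charset, $escaped);
    if ($converted === false) {
        // iconv() gave up entirely, e.g. on malformed input
        throw new RuntimeException('iconv() failed to convert the input');
    }

    // Walk the result left to right: "\x" is an escaped character and
    // yields "x"; a bare "?" was produced by iconv and becomes $replace.
    $result = preg_replace_callback(
        '/\\\\(.)|\?/s',
        static fn(array $m): string => $m[0] === '?' ? $replace : $m[1],
        $converted
    );
    if ($result === null) {
        throw new RuntimeException('preg_replace_callback() failed');
    }
    return $result;
}

$text = '? € é î ü π ∑ ñ';
echo translit_with_placeholder($text, ' '), PHP_EOL;
```

On a system where iconv() produces the output shown in the question, this should print the desired `? EUR e i u     n`; elsewhere the transliterated letters may differ. Because the callback returns the placeholder directly, characters such as $ or \ in $replace are not interpreted as backreferences. One limitation remains in both versions: if iconv() itself transliterates some character to a ? or a backslash (fullwidth forms may behave this way on some systems), that character is still misread.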