Tags: php, utf-8, character-encoding, special-characters, cyrillic

Convert Cyrillic to Western characters (both in UTF-8) - PHP


I've got a problem with Cyrillic characters. Our site is in UTF-8 and accepts these characters, but the external systems we pass data to do not. We are trying to pass on people's names, and those systems use other character sets (for example, ISO-8859-1).

Is there an easy way in PHP to manually convert each Cyrillic character into its equivalent Western character while the text is still in UTF-8, so that the later charset conversion does not fail? If we use the conversion functions directly, every Cyrillic character turns into a question mark.


Solution

  • I found a solution that resolved my problem. The function below converts each Cyrillic character to its equivalent Western character(s). Names are converted quite accurately with this function.

    Once the characters were replaced, I could use utf8_decode() to convert the string to ISO-8859-1.

    function do_translit($st) {
        // Map each Cyrillic letter to its Western (Latin) equivalent.
        $replacement = array(
            "й"=>"i","ц"=>"c","у"=>"u","к"=>"k","е"=>"e","н"=>"n",
            "г"=>"g","ш"=>"sh","щ"=>"sh","з"=>"z","х"=>"x","ъ"=>"'",
            "ф"=>"f","ы"=>"i","в"=>"v","а"=>"a","п"=>"p","р"=>"r",
            "о"=>"o","л"=>"l","д"=>"d","ж"=>"zh","э"=>"ie","ё"=>"e",
            "я"=>"ya","ч"=>"ch","с"=>"c","м"=>"m","и"=>"i","т"=>"t",
            "ь"=>"'","б"=>"b","ю"=>"yu",
            "Й"=>"I","Ц"=>"C","У"=>"U","К"=>"K","Е"=>"E","Н"=>"N",
            "Г"=>"G","Ш"=>"SH","Щ"=>"SH","З"=>"Z","Х"=>"X","Ъ"=>"'",
            "Ф"=>"F","Ы"=>"I","В"=>"V","А"=>"A","П"=>"P","Р"=>"R",
            "О"=>"O","Л"=>"L","Д"=>"D","Ж"=>"ZH","Э"=>"IE","Ё"=>"E",
            "Я"=>"YA","Ч"=>"CH","С"=>"C","М"=>"M","И"=>"I","Т"=>"T",
            "Ь"=>"'","Б"=>"B","Ю"=>"YU",
        );

        // strtr() replaces every key in a single, case-sensitive pass,
        // so uppercase letters keep their uppercase replacements.
        return strtr($st, $replacement);
    }

    Reference: http://php.net/manual/en/function.strtr.php
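
    As a rough illustration of the two steps described above (transliterate first, then re-encode), here is a minimal sketch. It assumes the mbstring extension is available. The helper name to_latin1() and the abbreviated $map are hypothetical and only serve the example. It uses strtr() for the replacement step and mb_convert_encoding() in place of utf8_decode(), which is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper: transliterate with a lookup table, then re-encode
// the result for a consumer that expects ISO-8859-1.
function to_latin1(string $utf8, array $map): string {
    // Replace every mapped character in a single, case-sensitive pass.
    $latin = strtr($utf8, $map);
    // Re-encode from UTF-8 to ISO-8859-1. Any character that is still
    // outside ISO-8859-1 is replaced by the mbstring substitute character.
    return mb_convert_encoding($latin, 'ISO-8859-1', 'UTF-8');
}

// Abbreviated map for illustration only; a real map covers the whole alphabet.
$map = ['И' => 'I', 'в' => 'v', 'а' => 'a', 'н' => 'n'];

echo to_latin1('Иван', $map), "\n"; // Ivan
```

    Because the Cyrillic letters are replaced while the string is still UTF-8, nothing is left for the final conversion to drop, provided the map covers every character that appears in the input.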