Tags: php, utf-8, preg-replace, ascii, non-ascii-characters

How to truncate non-ASCII characters from a string using PHP


I have the following string as a filename:

$string = 'recyclage plétre francin.jpg';

and tried the following code:

echo preg_replace('/[^a-z0-9|^.]/i', '_', iconv("UTF-8","ISO-8859-1//TRANSLIT",$string));

Because the filename contains a special (non-ASCII) character, this produces junk characters when handling file uploads in PHP.

What I want is to replace any Unicode (non-ASCII) character with a specific ASCII character. I want to keep all supported ASCII characters and remove the non-ASCII ones. I also want to keep / and \ because they act as directory separators in the filename when a root path is given.
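The requirement above (keep ASCII, replace everything else, preserve / and \) could be sketched roughly as follows. The function name sanitizeAsciiPath and the `_` replacement are illustrative assumptions, not part of the original post, and the input is assumed to be UTF-8. Like the attempt above, this sketch also replaces spaces.

```php
<?php
// Hypothetical helper (name and defaults are assumptions for illustration).
// Replaces every character outside a small ASCII whitelist with $replacement,
// while keeping dots and both kinds of directory separator.
function sanitizeAsciiPath(string $path, string $replacement = '_'): string
{
    // Whitelist: letters, digits, dot, underscore, hyphen, "/" and "\".
    $pattern = '~[^A-Za-z0-9._\-/\\\\]~';

    // With the /u modifier PCRE reads the subject as UTF-8, so one multibyte
    // character becomes one replacement instead of one per byte.
    $clean = preg_replace($pattern . 'u', $replacement, $path);

    if ($clean === null) {
        // preg_replace() returns null when the subject is not valid UTF-8;
        // fall back to a byte-wise replacement in that case.
        $clean = preg_replace($pattern, $replacement, $path);
    }

    return $clean;
}

echo sanitizeAsciiPath('uploads/recyclage plâtre francin.jpg'), "\n";
// prints: uploads/recyclage_pl_tre_francin.jpg
```

This only masks non-ASCII characters; it does not transliterate them (â becomes `_`, not `a`).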

Edit (the following is not solved):

I am having an issue with recyclage plƒtre francin.JPG: please see the ƒ character, which produces output like recyclage pl and truncates .JPG. The actual file name was recyclage plâtre francin, but while debugging it showed recyclage plƒtre francin.JPG, and the rest is written just after that. Any idea?

When I try to convert tri et recyclage du plâtre, it reads as tri et recyclage du plâtre, and after conversion it shows tri et recyclage du pl^atre.

Any help will be appreciated.
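One possible explanation for the ƒ seen while debugging, offered as an assumption the post does not confirm: the byte 0x83 is â in the DOS code page CP850 and ƒ in Windows-1252, so a name stored in one code page and displayed in the other would look exactly like this. A minimal sketch for checking that, assuming the mbstring extension is available and using a made-up raw byte string:

```php
<?php
// Assumption: the file name arrives as raw bytes in a legacy code page.
// "\x83" is a single byte; the rest of the name is plain ASCII.
$raw = "recyclage pl\x83tre francin.JPG";

// The same byte decodes to different characters depending on the code page.
$asCp850  = mb_convert_encoding($raw, 'UTF-8', 'CP850');        // DOS/OEM code page
$asCp1252 = mb_convert_encoding($raw, 'UTF-8', 'Windows-1252'); // Windows "ANSI" code page

echo $asCp850, "\n";   // prints: recyclage plâtre francin.JPG
echo $asCp1252, "\n";  // prints: recyclage plƒtre francin.JPG

// The raw bytes are not valid UTF-8, so treating them as UTF-8 cannot work.
var_dump(mb_check_encoding($raw, 'UTF-8')); // prints: bool(false)
```

If this is what is happening, the name would need to be converted from its real source encoding to UTF-8 before any transliteration step.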


Solution

  • Here is a solution to my question. I was finally able to get the conversion working. Some Unicode characters are replaced with ASCII characters, and everything now works fine.

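    function toASCII($str)
    {
        $accent   = 'ŠŒŽšœžŸ¥µÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûýýþÿŔŕƒ';
        $noaccent = 'SOZsozYYuaaaaaaaceeeeiiiidnoooooouuuuybsaaaaaaaceeeeiiiidnoooooouuuyybyRra';
        // strtr() with two string arguments translates byte by byte, so build a
        // character-to-character map instead (assumes this file and $str are UTF-8).
        $map = array();
        foreach (preg_split('//u', $accent, -1, PREG_SPLIT_NO_EMPTY) as $i => $char) {
            if (isset($noaccent[$i])) {
                $map[$char] = $noaccent[$i];
            }
        }
        return strtr($str, $map);
    }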