
PHP Transliteration and renaming files


This is my problem: the files are not being renamed. What am I doing wrong? What am I missing? The script has to work on both Windows and Unix. The script file is saved in Unix format, UTF-8 without BOM. I also tried Windows-1251 and ANSI, and it still doesn't work.

 <?php
 function Transliteration($FileName){ 
 $CharReplace = array (
'А'=>'A', 'Б'=>'B', 'В'=>'V',
'Г'=>'G', 'Д'=>'D', 'Е'=>'E',
'Ё'=>'E', 'Ж'=>'ZH', 'З'=>'Z',
'И'=>'I', 'Й'=>'J', 'К'=>'K',
'Л'=>'L', 'М'=>'M', 'Н'=>'N',
'О'=>'O', 'П'=>'P', 'Р'=>'R',
'С'=>'S', 'Т'=>'T', 'У'=>'U',
'Ф'=>'F', 'Х'=>'H', 'Ц'=>'TS',
'Ч'=>'CH', 'Ш'=>'SH', 'Щ'=>'SHH',
'Ъ'=>'', 'Ы'=>'I', 'Ь'=>'',
'Э'=>'E', 'Ю'=>'YU', 'Я'=>'YA',
'а'=>'a', 'б'=>'b', 'в'=>'v',
'г'=>'g', 'д'=>'d', 'е'=>'e',
'ё'=>'yo', 'ж'=>'zh', 'з'=>'z',
'и'=>'i', 'й'=>'j', 'к'=>'k',
'л'=>'l', 'м'=>'m', 'н'=>'n',
'о'=>'o', 'п'=>'p', 'р'=>'r',
'с'=>'s', 'т'=>'t', 'у'=>'u',
'ф'=>'f', 'х'=>'h', 'ц'=>'ts',
'ч'=>'ch',  'ш'=>'sh', 'щ'=>'shh',
'ъ'=>'', 'ы'=>'i', 'ь'=>'',
'э'=>'e', 'ю'=>'yu', 'я'=>'ya',
"№"=>"N", " "=>"_", "–"=>"_",
"-"=>"_", " - "=>"_", ","=>"");
$FileNameTranslited = str_replace(array_keys($CharReplace), $CharReplace, $FileName);
return $FileNameTranslited;}

function Renaming(){
$WorkDir = opendir("ToRename") or die("Не могу открыть папку");
while ($CurrentFile = readdir($WorkDir)){
    if ($CurrentFile != "." && $CurrentFile != ".."){
        $TranslitedFile = Transliteration($CurrentFile);
        if (rename($CurrentFile, $TranslitedFile))
            {echo "File Renamed";}
            else{echo "Some shit happen!";}
        echo $CurrentFile." -> ".$TranslitedFile."<br>";}}}

 Renaming();
 ?>

Thanks a lot, StathisG! That is the right key to the solution, but it still isn't working. Look here:

 function Renaming(){
 $directory = 'ToRename/';
 $WorkDir = opendir($directory) or die("Не могу открыть папку");
 while ($CurrentFile = readdir($WorkDir)){
   if ($CurrentFile != "." && $CurrentFile != ".."){
    $WhichCodingWeWant = 'UTF-8';
    $FileNameCoding = mb_detect_encoding($CurrentFile);
    echo $FileNameCoding."<br/>";
    $utf8_filename = mb_convert_encoding($CurrentFile, $WhichCodingWeWant, $FileNameCoding);
    $TranslitedFile = Transliteration($utf8_filename);
    mb_convert_encoding($TranslitedFile, $FileNameCoding, $WhichCodingWeWant);
    echo mb_detect_encoding($TranslitedFile)."<br/>";
    if (rename($directory . $CurrentFile, $directory . $TranslitedFile)) {
       echo "File Renamed<br/>";
       } else {
         echo "Some shit happen!<br/>";
          }
        echo $utf8_filename." -> ".$TranslitedFile."<br>";
       }
    }
 }
   Renaming(); 

As you can see, I added two new variables, "$WhichCodingWeWant" and "$FileNameCoding". The incoming file name is "Новый текстовый документ.txt", the output is "Íîâûé_òåêñòîâûé_äîêóìåíò.txt", but it should be "Novij_tekstovij_dokument.txt". My brain has exploded...


Okay, step 3. The incoming data is the same as before: Новый текстовый документ.txt

 function Renaming(){
 $directory = 'ToRename/';
 $WorkDir = opendir($directory) or die("Не могу открыть папку");
 while ($CurrentFile = readdir($WorkDir)){
    if ($CurrentFile != "." && $CurrentFile != ".."){
        echo "What name is come: ".$CurrentFile."<br/>";
        $WhichCodingWeWant = 'UTF-8';
        $FileNameCoding = mb_detect_encoding($CurrentFile);
        echo "File name encoding: ".$FileNameCoding."<br/>";

        $utf8_filename = mb_convert_encoding($CurrentFile, $WhichCodingWeWant, $FileNameCoding);
        echo "File name behind transliting: ".$utf8_filename."<br/>";
        $TranslitedFile = Transliteration($utf8_filename);
        echo "File name translited to: ".$TranslitedFile."<br/>";

        mb_convert_encoding($TranslitedFile, $FileNameCoding, $WhichCodingWeWant);
        echo "File name encoding converted to: ".mb_detect_encoding($TranslitedFile)."<br/>";

        if (rename($directory . $CurrentFile, $directory . $TranslitedFile)) {
            echo "File Renamed<br/>";
        } else {
            echo "Some shit happen!<br/>";
        }
        echo $utf8_filename." -> ".$TranslitedFile."<br>";
    }
 }
 }
 Renaming();

 Result is: 
 What name is come: Новый текстовый документ.txt
 File name encoding: UTF-8
 File name behind transliting: ????? ????????? ????????.txt
 File name translited to: ?????_?????????_????????.txt
 File name encoding converted to: ASCII

 Warning: : No error in E:\WEB\XAMPP\htdocs\my\Site\test\test6.php on line 32
 Some shit happen!
 ????? ????????? ????????.txt -> ??????????????????????.txt

And the file is not renamed in the folder.

Why ASCII, if I asked for UTF-8? I understand that I understand nothing! Anyway, thank you, StathisG, for trying to help me! I'll try this script on a Linux system tomorrow and tell you about the results. If you have any ideas about all this, I'll be glad to see them :)


Solution

  • Your code produces the following warning:

    Warning: rename(test.txt, test.txt): The system cannot find the file specified.

    The $CurrentFile variable holds only the filename, not the complete path to the file, so rename() looks for it relative to the current working directory instead of inside ToRename/. Try the following:

    function Renaming(){
        $directory = 'ToRename/';
        $WorkDir = opendir($directory) or die("Не могу открыть папку");
        while ($CurrentFile = readdir($WorkDir)){
            if ($CurrentFile != "." && $CurrentFile != ".."){
                $utf8_filename = mb_convert_encoding($CurrentFile, 'UTF-8', 'GREEK');
                $TranslitedFile = Transliteration($utf8_filename);
                if (rename($directory . $CurrentFile, $directory . $TranslitedFile)) {
                    echo "File Renamed";
                } else {
                    echo "Some shit happen!";
                }
                echo $utf8_filename." -> ".$TranslitedFile."<br>";
            }
        }
    }
    Renaming();
    

    I tested your Transliteration function separately, and it seems to work fine (see the test below), so ignore my original comment about the multibyte string functions.

    echo Transliteration('Не могу открыть папку'); // produces 'Ne_mogu_otkrit_papku'
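
    A side note on the replacement table, as a minimal sketch rather than a required change: str_replace() with arrays applies the pairs one after another, so the " " and "-" rules run before the " - " rule and that rule can never match. strtr() with an array scans the string once and prefers the longest matching key. The three-entry $map below is only an excerpt of the separator rules, with the Cyrillic entries omitted for brevity:

    ```php
    <?php
    // Excerpt of the separator rules from $CharReplace (illustration only).
    $map = [' ' => '_', '-' => '_', ' - ' => '_'];
    $name = 'a - b.txt';

    // str_replace() applies the pairs in array order: ' ' first, then '-',
    // so by the time ' - ' is tried there is nothing left for it to match.
    $sequential = str_replace(array_keys($map), array_values($map), $name);
    echo $sequential, "\n"; // a___b.txt

    // strtr() with an array makes a single pass and tries the longest key first.
    $singlePass = strtr($name, $map);
    echo $singlePass, "\n"; // a_b.txt
    ```

    If a single underscore for " - " is what you want, returning strtr($FileName, $CharReplace) from Transliteration would be one way to get it.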

    EDIT:

    I edited the code above, adding the following line:

    $utf8_filename = mb_convert_encoding($CurrentFile, 'UTF-8', 'GREEK');

    Then, I used the $utf8_filename as the variable passed to your Transliteration function:

    $TranslitedFile = Transliteration($utf8_filename);

    As you may have noticed, I used 'GREEK' as the filename's encoding, since Greek is the only language I know other than English, so I used Greek filenames to test your code.

    I created a file called "τεστ.txt", and added the following values to the $CharReplace array: 'τ'=>'t', 'ε'=>'e', 'σ'=>'s'

    When I ran the code, I got the following message, and the file was successfully renamed to "test.txt".

    File Renamed τεστ.txt -> test.txt
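
    Regarding the "Why ASCII?" question: mb_convert_encoding() returns the converted string and leaves its argument untouched, so calling it without assigning the result has no effect. Also, a transliterated name consists only of ASCII characters, so it is valid ASCII and valid UTF-8 at the same time. A small sketch of both points, assuming the script file itself is saved as UTF-8 and that 'Windows-1251' is the encoding of interest:

    ```php
    <?php
    $utf8 = 'Новый'; // 5 Cyrillic letters, 10 bytes in UTF-8

    // Return value discarded: $utf8 is not modified.
    mb_convert_encoding($utf8, 'Windows-1251', 'UTF-8');
    var_dump(strlen($utf8));   // int(10)

    // Return value kept: one byte per letter in Windows-1251.
    $cp1251 = mb_convert_encoding($utf8, 'Windows-1251', 'UTF-8');
    var_dump(strlen($cp1251)); // int(5)

    // A name made only of ASCII characters passes both checks.
    $translit = 'Novij_tekstovij_dokument.txt';
    var_dump(mb_check_encoding($translit, 'ASCII')); // bool(true)
    var_dump(mb_check_encoding($translit, 'UTF-8')); // bool(true)
    ```

    Since the transliterated name is plain ASCII anyway, converting it back to the original encoding should not be needed before calling rename().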
    

    The PHP manual lists the encodings that mb_convert_encoding supports.

    So try the code above, replacing the encoding value with the one that corresponds to the characters you use, and check whether that solves your problem.
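
    For the Russian file names in the question, the garbled output "Íîâûé_..." looks like Windows-1251 bytes displayed as Latin-1, so 'Windows-1251' is a plausible value to try in place of 'GREEK'. The following is only a sketch under that assumption. The helper name to_utf8_name and its default encoding are hypothetical and not part of the original script, and the UTF-8 check is a heuristic that could misjudge an unusual Windows-1251 name:

    ```php
    <?php
    // Hypothetical helper: the name and the 'Windows-1251' default are assumptions.
    function to_utf8_name(string $rawName, string $sourceEncoding = 'Windows-1251'): string
    {
        // A name that is already valid UTF-8 (typical on Linux) is returned as-is.
        if (mb_check_encoding($rawName, 'UTF-8')) {
            return $rawName;
        }
        // Otherwise treat the bytes as the assumed legacy encoding.
        return mb_convert_encoding($rawName, 'UTF-8', $sourceEncoding);
    }

    // Bytes CD EE E2 FB E9 spell "Новый" in Windows-1251;
    // read as Latin-1 they show up as "Íîâûé".
    $raw = "\xCD\xEE\xE2\xFB\xE9.txt";
    echo to_utf8_name($raw), "\n";         // Новый.txt
    echo to_utf8_name('Новый.txt'), "\n";  // already UTF-8, printed unchanged
    ```

    Inside the loop this would be used as Transliteration(to_utf8_name($CurrentFile)), while rename() still receives the original $CurrentFile bytes as the source name.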