php, cakephp, php-ziparchive

Special characters issue while extracting zip folder using ZipArchive in php


A file in my zip archive is named "Norge språk.pdf", but after extracting the archive the file name comes out as "Norge spr†k.pdf".

$zip = new ZipArchive;
if ($zip->open($path, ZIPARCHIVE::CREATE) === true) 
{
    if(!file_exists(WWW_ROOT."/excel/".$name))
    {
        mkdir(WWW_ROOT."/excel/".$name, 0777);
    }
    for($i = 0; $i < $zip->numFiles; $i++) 
    {
        // Entry name exactly as stored in the archive ($test was undefined in the original snippet)
        $test = $zip->getNameIndex($i, ZIPARCHIVE::FL_UNCHANGED);
        $fileinfo = pathinfo($test);

        copy("zip://".$path."#".htmlentities($test, ENT_COMPAT, 'ISO-8859-1'), WWW_ROOT."/excel/".$name.'/'.htmlentities($fileinfo['basename'], ENT_COMPAT, 'ISO-8859-1'));
    }                   
    $zip->close();                   
}

Can anyone please help me with this issue?


Solution

  • I don't know CakePHP, but the real issue lies with the ZIP format. One possible problem is that the contents of the zipped files are not being treated as binary data, as they should be. This may stem from your own file or variable handling: PHP is loosely typed, which means the engine chooses the type of a variable automatically.

    The other thing is the treatment of the file names. These names are character data and are stored in the zip file as raw bytes, with no reliable information about their encoding (a UTF-8 flag was added to the format later, but many tools do not set it).

    So the only thing you can rely on is 7-bit ASCII. But since the ISO Latin-1 code table is in widespread use (and contains all the Scandinavian special characters), the problems you face tend to be caused by automatic conversions as well, because a file name typed on your own computer shouldn't look any different when it is displayed there again.
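
    To make the conversion problem concrete, here is a minimal sketch of how one stored byte can turn "å" into "†". It assumes the archive was written by a tool that used a DOS code page such as CP850, and that the name was later read as Windows-1252; the byte value 0x86 and both encodings are assumptions chosen to match the symptom in the question, not something confirmed about this particular archive.

    ```php
    <?php
    // The file name as raw bytes: 0x86 is where "å" sits in CP850 (assumed source encoding).
    $stored = "Norge spr\x86k.pdf";

    // Interpreting the same bytes with two different code pages:
    $asCp850  = iconv('CP850', 'UTF-8', $stored);   // DOS Western European
    $asCp1252 = iconv('CP1252', 'UTF-8', $stored);  // Windows Western European

    echo $asCp850, PHP_EOL;   // Norge språk.pdf
    echo $asCp1252, PHP_EOL;  // Norge spr†k.pdf
    ```

    If the second line matches what you see on disk, the name was probably stored in a DOS code page and only needs to be converted from that code page when extracting.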

    One workaround is to convert the file name explicitly when adding the file to the archive:

     $zip->addFile($file_data['path'], iconv("UTF-8","CP852",$file_name));
    

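    Replace CP852 with the code page that matches your characters (for Norwegian letters such as å that is more likely CP850 or CP865, since CP852 targets Central European languages), or extract the archive with the system's unzip instead: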

     system('unzip -o ' . escapeshellarg($file));
    

    For more background on the underlying issue, see "php zip contents encoding".
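
    On the extraction side, the same idea can be sketched as follows: read each entry name as raw bytes, convert it to UTF-8, and write the contents out as binary data. The helper names `decodeEntryName` and `extractWithUtf8Names` are hypothetical, and CP850 as the source encoding is an assumption; like the loop in the question, this sketch flattens any directories inside the archive.

    ```php
    <?php
    // Convert a raw entry name to UTF-8. CP850 is an assumed default, adjust as needed.
    function decodeEntryName(string $raw, string $sourceEncoding = 'CP850'): string
    {
        // Names that are already valid UTF-8 (including plain ASCII) are left alone.
        if (preg_match('//u', $raw) === 1) {
            return $raw;
        }
        $converted = iconv($sourceEncoding, 'UTF-8', $raw);
        return $converted === false ? $raw : $converted;
    }

    // Extract all files from $zipPath into $destDir and return the names written.
    function extractWithUtf8Names(string $zipPath, string $destDir, string $sourceEncoding = 'CP850'): array
    {
        $zip = new ZipArchive();
        if ($zip->open($zipPath) !== true) {
            throw new RuntimeException("Cannot open archive: $zipPath");
        }
        if (!is_dir($destDir) && !mkdir($destDir, 0777, true)) {
            throw new RuntimeException("Cannot create directory: $destDir");
        }
        $written = [];
        for ($i = 0; $i < $zip->numFiles; $i++) {
            // FL_ENC_RAW returns the name bytes without any guessing by libzip.
            $raw = $zip->getNameIndex($i, ZipArchive::FL_ENC_RAW);
            if ($raw === false || substr($raw, -1) === '/') {
                continue; // unreadable name or directory entry
            }
            $utf8 = decodeEntryName($raw, $sourceEncoding);
            // Keep only the part after the last slash (byte-safe alternative to basename()).
            $pos  = strrpos($utf8, '/');
            $base = $pos === false ? $utf8 : substr($utf8, $pos + 1);
            $data = $zip->getFromIndex($i); // file contents as a binary string
            if ($base === '' || $data === false) {
                continue;
            }
            file_put_contents($destDir . '/' . $base, $data);
            $written[] = $base;
        }
        $zip->close();
        return $written;
    }
    ```

    With the variables from the question, a call might look like `extractWithUtf8Names($path, WWW_ROOT."/excel/".$name);`. Note that `getFromIndex` loads each file into memory, so very large entries may need a streaming approach instead.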