PHP encoding issue with PDF files


On Ubuntu with PHP, I'm facing a common problem for which I haven't found a solution. I upload a PDF file and convert it into a text file (using ImageMagick + Tesseract):

    $output = shell_exec('convert -density 300 ' . $fichier . ' ' . $fichier_noExt . '.png');
    $output = shell_exec('tesseract ' . $fichier_noExt . '.png ' . $fichier_noExt . '.txt');

When I then do this:

    // Read the generated text file and print its contents
    echo file_get_contents($fichier_txt . '.txt');

I get 'Â°' instead of '°', 'â‚¬' instead of '€' and 'Ã©' instead of 'é'. I know it's an encoding issue, but I can't locate it.


Solution

  • Oh dear...

    I just forgot to add this at the top of my file:

    header('Content-Type: text/html; charset=utf-8');
    

    It works now. Sorry for wasting your time, but I needed a fresh look :).

    Have a nice day, and cya!
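
For anyone hitting the same symptom, here is a minimal sketch of why the missing charset produces those characters. It assumes the text file is UTF-8 and that the browser, with no charset declared, fell back to Windows-1252; the thread confirms neither. The helper name `simulateWrongCharset` is hypothetical and only there for illustration.

```php
<?php
// Sketch: reproduce the garbling by decoding UTF-8 bytes as Windows-1252.
// Assumption: this mimics a browser that was not told the page is UTF-8.
// simulateWrongCharset() is a hypothetical helper, not part of the question.
function simulateWrongCharset(string $utf8Bytes): string
{
    // Interpret the raw bytes as Windows-1252 and re-encode them as UTF-8.
    return iconv('CP1252', 'UTF-8', $utf8Bytes);
}

// UTF-8 byte sequences for the three characters from the question.
$samples = [
    'degree' => "\xC2\xB0",     // °
    'euro'   => "\xE2\x82\xAC", // €
    'eacute' => "\xC3\xA9",     // é
];

foreach ($samples as $name => $bytes) {
    echo $name, ': ', $bytes, ' is shown as ', simulateWrongCharset($bytes), PHP_EOL;
}
```

Each multi-byte UTF-8 character comes out as two or three single-byte characters, which matches the pattern in the question. Declaring `charset=utf-8` in the `Content-Type` header, sent before any output, makes the browser decode the bytes as intended.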