Tags: php, apache, youtube-dl

Non-ASCII characters in filename are skipped


youtube-dl itself supports non-ASCII characters in filenames. It works flawlessly on my web server when run as the root user and as the www-data user, but when I download a video by calling youtube-dl from PHP, the non-ASCII characters are dropped from the filename.

For example, Stromae - bâtard is saved as Stromae - btard.mp4, and البث الحي is saved as .mp4.

I am using this code to run the CLI command:

function cmd($string) {
  $descriptorspec = array(
     0 => array("pipe", "r"),  // stdin
     1 => array("pipe", "w"),  // stdout
     2 => array("pipe", "w"),  // stderr
  );
  $process = proc_open($string, $descriptorspec, $pipes);
  if (!is_resource($process)) {
    return false;  // the command could not be started
  }
  fclose($pipes[0]);  // nothing is written to the child's stdin
  $stdout = stream_get_contents($pipes[1]);
  fclose($pipes[1]);
  $stderr = stream_get_contents($pipes[2]);
  fclose($pipes[2]);
  $ret = proc_close($process);
  return $stdout;
}
$value = 'youtube-dl https://some.valid/link';
echo cmd($value);

Kindly advise what I should do to fix this issue.


Solution

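  • Check the Environment section of your phpinfo() output for LC_ALL or LANG. I suspect this has nothing to do with PHP itself, but with the difference between the shell environment you use and the environment your web server runs in. If the web server's environment has no UTF-8 locale set, youtube-dl most likely falls back to an ASCII filename encoding and drops every character it cannot encode. Try setting the locale explicitly in the command (the locale you name must be installed on the server; locale -a lists the available ones):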

    $value = ('LC_ALL=en_US.UTF-8 youtube-dl https://some.valid/link');
    echo cmd($value);
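
As an alternative to prefixing the command string, the locale can be passed through the environment argument of proc_open. The following is a minimal sketch, not a definitive fix: the function name cmd_with_locale and the default en_US.UTF-8 are assumptions for illustration, and the locale still has to exist on the server.

```php
<?php
// Sketch: run a command with a UTF-8 locale supplied via proc_open's env argument.
// cmd_with_locale() is a hypothetical helper; en_US.UTF-8 is an assumed default.
function cmd_with_locale(string $command, string $locale = 'en_US.UTF-8'): string
{
    $descriptorspec = [
        0 => ['pipe', 'r'],  // stdin
        1 => ['pipe', 'w'],  // stdout
        2 => ['pipe', 'w'],  // stderr
    ];

    // A non-null env array replaces the child's whole environment,
    // so PATH is carried over for the shell to find the program.
    $env = [
        'PATH'   => getenv('PATH') ?: '/usr/local/bin:/usr/bin:/bin',
        'LC_ALL' => $locale,
        'LANG'   => $locale,
    ];

    $process = proc_open($command, $descriptorspec, $pipes, null, $env);
    if (!is_resource($process)) {
        return '';  // the command could not be started
    }

    fclose($pipes[0]);                         // no input for the child
    $stdout = stream_get_contents($pipes[1]);  // collect normal output
    fclose($pipes[1]);
    stream_get_contents($pipes[2]);            // drain stderr, ignored here
    fclose($pipes[2]);
    proc_close($process);

    return $stdout;
}

// Quick check of what the child process actually sees.
echo cmd_with_locale('echo "$LC_ALL"');
```

Once the check prints the expected locale, the same call can be used with the youtube-dl command from the question. Because the environment is replaced rather than extended, any other variable the command relies on (HOME, for instance) would need to be added to the array as well.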