Tags: windows, for-loop, batch-file, cmd, youtube-dl

Using youtube-dl in a Windows CMD FOR loop strips non-ASCII characters


Running youtube-dl directly in CMD works as expected:

youtube-dl -e "https://www.youtube.com/watch?v=E_JXrNAxGzM"

It correctly gives the title of the YouTube video: 27/12/2016 晚間新聞 楊家駿直播睇手機

However, if I run the same command inside a FOR loop in a Windows batch file, the non-ASCII characters are removed completely. The batch file code:

@ECHO OFF
FOR /F "delims=" %%i IN ('youtube-dl -e "https://www.youtube.com/watch?v=E_JXrNAxGzM"') DO (
ECHO %%i
)
PAUSE
EXIT

This only gives the following result: 27/12/2016

As a test, I tried this:

set var=晚間新聞楊家駿直播睇手機

for %%i in (%var%) do (
echo %%i
)

This works fine and echoes the Chinese characters correctly, which leads me to believe it's not a Unicode problem in CMD but something tied to youtube-dl.

However, I have been assured that it's not a youtube-dl problem.

Is there something I'm missing, and is there any way to get this working?


Solution

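  • The --encoding utf-8 switch appears to work here in combination with chcp 65001. Disclaimer: I only tried this under Windows 10 v1909, using the non-legacy console with the NSimSun font; YMMV with other versions or settings. The command below is typed at the prompt, so it uses %i; inside a batch file, double the percent sign (%%i).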

    C:\etc>chcp 65001
    Active code page: 65001
    
    C:\etc>for /f "delims=" %i in ('youtube-dl --encoding utf-8 -e "https://www.youtube.com/watch?v=E_JXrNAxGzM"') do @echo %i
    27/12/2016 晚間新聞 楊家駿直播睇手機
    

    ________

    On the remark "I have been assured that it's not a youtube-dl problem":

    The real question to ask the dev is whether youtube-dl does any detection of the output stream being sent to the interactive console vs. being piped or redirected, and whether it changes the output encoding based on that detection. I believe the answer to that might be yes, which would explain the difference between direct console output vs. the for loop.
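
    To make that hypothesis concrete, here is a minimal Python sketch of how a program can tell an interactive console from a pipe, and how a narrow output encoding can silently drop characters. This is not youtube-dl's actual code; the helper names (describe_stream, encode_for_stream), the cp1252 code page, and the errors="ignore" policy are all assumptions picked only to reproduce the symptom from the question.

```python
import sys


def describe_stream(stream):
    """Return (is_console, encoding) for a text stream.

    Hypothetical helper for illustration; not part of youtube-dl.
    """
    # isatty() is True when the stream is attached to an interactive console
    # and False when it is piped or redirected (as it is inside FOR /F).
    is_console = stream.isatty()
    # Some stream objects report no encoding, so fall back to a label.
    encoding = getattr(stream, "encoding", None) or "unknown"
    return is_console, encoding


def encode_for_stream(text, encoding, errors="ignore"):
    """Encode text for a target encoding.

    errors="ignore" is an assumption chosen to mimic the stripped title;
    a real program might use "replace" or raise an error instead.
    """
    return text.encode(encoding, errors=errors)


if __name__ == "__main__":
    is_console, encoding = describe_stream(sys.stdout)
    # Report on stderr so the result stays visible when stdout is captured.
    sys.stderr.write("console=%s encoding=%s\n" % (is_console, encoding))
```

    Saved under any name and run once directly and once inside for /f, the script should report console=True in the first case and console=False in the second. Under the same assumptions, encode_for_stream("27/12/2016 晚間新聞", "cp1252") returns b"27/12/2016 ", which matches the truncated title, while passing "utf-8" keeps every character. That would be consistent with --encoding utf-8 plus chcp 65001 fixing the loop.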