I have a list of Japanese Kanji and their pronunciations saved in a text file (JouyouKanjiReadings.txt) like this
亜 ア
哀 アイ,あわれ,あわれむ
愛 アイ
悪 アク,オ,わるい
握 アク,にぎる
圧 アツ
(each gap is made by pressing TAB)
and I have a script like this
@echo off
set /p text=Enter here:
echo %text%>Search.txt
echo.
findstr /G:"Search.txt" JouyouKanjiReadings.txt || echo No Results && pause > nul && exit
pause > nul
However, when I run the script, I always get "No Results". I tried with English characters and it worked fine. I also tried the same script with this
findstr "%text%" JouyouKanjiReadings.txt || echo No Results && pause > nul && exit
but got the same results. Is there any way to get around this? Also, I'm displaying these characters correctly in the command prompt by using
chcp 65001
and a different font.
You need to use find (which supports Unicode but not regex) instead of findstr (which supports regex but not Unicode). See "Why are there both FIND and FINDSTR programs, with unrelated feature sets?"
D:\kanji>chcp
Active code page: 65001
D:\kanji>find "哀" JouyouKanjiReadings.txt
---------- JOUYOUKANJIREADINGS.TXT
哀 アイ,あわれ,あわれむ
Redirect to NUL to suppress the output if you don't need it.
That said, find isn't a good solution either. Nowadays you should use PowerShell instead of cmd, which carries many quirks kept for legacy compatibility. PowerShell fully supports Unicode and can call any .NET Framework method. To search for strings you can use the cmdlet Select-String or its alias sls:
PS D:\kanji> Select-String '握' JouyouKanjiReadings.txt
JouyouKanjiReadings.txt:5:握 アク,にぎる
In fact you don't even need to use UTF-8 and code page 65001. Just store the file in UTF-16 with a BOM (that also gives a smaller file, because kana and common kanji take 3 bytes each in UTF-8 but only 2 in UTF-16); find and sls will then automatically search in UTF-16.
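If the file is currently stored as UTF-8, the conversion could be done from PowerShell roughly as follows. This is only a sketch: Convert-ToUtf16 is a hypothetical helper name, and the output file name in the usage line is an assumption, not something from the question.

```powershell
# Hypothetical helper: rewrites a UTF-8 text file as UTF-16 LE with a BOM.
function Convert-ToUtf16 {
    param(
        [Parameter(Mandatory)] [string] $Path,         # existing UTF-8 file
        [Parameter(Mandatory)] [string] $Destination   # new UTF-16 file (must differ from $Path)
    )
    # -Encoding UTF8 forces the input to be decoded as UTF-8;
    # -Encoding Unicode writes UTF-16 LE, including the BOM.
    Get-Content -LiteralPath $Path -Encoding UTF8 |
        Set-Content -LiteralPath $Destination -Encoding Unicode
}

# Usage (the output file name is an assumption):
# Convert-ToUtf16 -Path JouyouKanjiReadings.txt -Destination JouyouKanjiReadings.utf16.txt
```

Writing to a separate file keeps the original intact until you have checked that the search works on the converted copy.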
Of course, if there is a lot of existing batch code, you can call PowerShell from cmd like this:
powershell -Command "Select-String '哀' JouyouKanjiReadings.txt"
But if the script is entirely new, please just avoid the hassle and use PowerShell.
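For reference, the batch script from the question might be ported to PowerShell along these lines. This is a minimal sketch rather than a drop-in replacement: Search-Kanji is a hypothetical function name, and it assumes the readings file is stored as UTF-8, as in the question.

```powershell
# Hypothetical helper mirroring the batch script:
# prints every matching line, or "No Results" when nothing matches.
function Search-Kanji {
    param(
        [Parameter(Mandatory)] [string] $Text,
        [string] $Path = 'JouyouKanjiReadings.txt'
    )
    # -SimpleMatch treats $Text as a literal string rather than a regex.
    # -Encoding UTF8 assumes the file is stored as UTF-8.
    $hits = Select-String -LiteralPath $Path -Pattern $Text -SimpleMatch -Encoding UTF8
    if ($hits) {
        $hits | ForEach-Object { $_.Line }
    } else {
        'No Results'
    }
}

# Usage, prompting the same way the batch script does:
# Search-Kanji -Text (Read-Host 'Enter here')
```

-SimpleMatch is used here because the input is a plain kanji or reading, so regex metacharacters typed by the user are not interpreted.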