windows command-line fonts raster

Junk character replaces the '/' in windows command prompt


I am facing a strange problem: in the Windows command prompt, every '/' character is replaced by a junk character (a yen symbol). I tried two fonts, "MS Gothic" and "Raster", but the problem is the same. With Raster the problem is intermittent. Please let me know how to solve it.

Thanks, Naga


Solution

  • Type chcp at the command prompt, and I bet you'll see Active code page: 932

    The Windows console has the concept of code pages, a relic of pre-Unicode days, in which the byte values 0-255 are mapped to different characters depending on the language. The characters a-z, A-Z, and 0-9 are consistent across code pages, but less commonly used byte values are mapped to characters that are popular in the target language.

    In code page 932 (Japanese), the byte that ASCII uses for the backslash, 0x5C, is shown as the yen character.

    This is a common issue. See Microsoft's note on MSDN:

    Caution Windows code page and OEM code page character sets used on Japanese-language operating systems contain the Yen symbol (¥) instead of a backslash (\). Thus, the Yen symbol is a prohibited character for NTFS and FAT file systems. When mapping Unicode to a Japanese-language code page, WideCharToMultiByte and other conversion functions map both backslash (U+005C) and the normal Unicode Yen symbol (U+00A5) to this same character. For security reasons, your applications should not typically allow the character U+00A5 in a Unicode string that might be converted for use as a FAT file name. For more information, see Security Considerations: International Features.

    UPDATE

    Sorry for the delay; it took me a while to recall where I had originally read about this. The best reference is Mike Kaplan's weblog entry here. michkap is the best Microsoft blog for all things Unicode. If you deal with charsets, encoding issues, and the dark corners of internationalization, his blog is an essential reference.

    From his entry on the yen character as the backslash:

    ...on Japanese code page 932, 0x5c is the YEN SIGN, and on Korean code page 949, 0x5c is the WON SIGN.

    Which is not to say that 0x5c does not act as a path separator -- it still does. And which is also not to say that the Unicode code points for the Yen and the Won (U+00a5 and U+20a9) do act as path separators -- because they do not.

    ...

    In practice, after many years of code page based systems in Japan and Korea using their respective currency symbols as the path separators, it is believed customers were simply used to this appearance. And there was therefore little interest in changing that appearance (when the system settings were Japanese or Korean) to anything but those symbols.

    To support this expectation, Japanese and Korean fonts, whenever the default system locale is set to Japanese or Korean, respectively, will display the currency symbol rather than the backslash when U+005c is shown.

    You'll be hard pressed to find a better reference than that one, I believe.
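
As a rough illustration of the distinction drawn above (byte 0x5C stays the path separator, while the real Unicode yen sign U+00A5 does not), here is a minimal Python sketch. It uses Python's own cp437 and cp932 codecs and the standard ntpath module, which are independent implementations and not the Windows console or WideCharToMultiByte, so treat it as an approximation of the behaviour described rather than a test of Windows itself. The helper names decode_byte and describe are made up for this example.

```python
import ntpath  # Windows path rules, importable on any platform

BACKSLASH = "\u005c"  # REVERSE SOLIDUS, the real path separator
YEN = "\u00a5"        # YEN SIGN, a different Unicode code point


def decode_byte(byte_value: int, codec: str) -> str:
    """Decode one byte with the given codec and return the character."""
    return bytes([byte_value]).decode(codec)


def describe(ch: str) -> str:
    """Format a character as a U+XXXX code point label."""
    return f"U+{ord(ch):04X}"


# Byte 0x5C decodes to the same code point under a US OEM code page
# (cp437) and under Python's Japanese cp932 codec. The yen glyph seen
# in the console comes from how the font draws that code point.
us_char = decode_byte(0x5C, "cp437")
jp_char = decode_byte(0x5C, "cp932")
print(describe(us_char), describe(jp_char))

# Only U+005C splits a Windows path; U+00A5 is treated as an ordinary
# file name character.
with_backslash = ntpath.split("C:" + BACKSLASH + "temp" + BACKSLASH + "a.txt")
with_yen = ntpath.split("temp" + YEN + "a.txt")
print(with_backslash)
print(with_yen)
```

In other words, a path typed at a Japanese-locale prompt still contains ordinary backslashes even when the font draws them as yen signs, so scripts and file names keep working; changing the active code page with chcp or choosing a non-Japanese font only changes what is drawn.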