Tags: java, unicode, jvm, windows-11, regional-settings

Java doesn't recognize Unicode character in path on Windows 11


Java isn't able to recognize Unicode characters when the Windows option Beta: Use Unicode UTF-8 for worldwide language support is enabled.
The path to my user folder is C:\Users\Otávio Augusto Silva, and the á character is causing trouble for Java. When the JDK is installed inside my user folder using scoop install, calling the javac command gives the following result:

Erro: Não é possível carregar a classe principal com.sun.tools.javac.Main no módulo jdk.compiler
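        java.lang.UnsatisfiedLinkError: no jimage in system library path: C:\Users\OtÃ¡vio Augusto Silva\scoop\apps\zulu-jdk\current\bin

(In English, the first line reads: Error: Unable to load main class com.sun.tools.javac.Main in module jdk.compiler.)
Notice that it replaces the á character with Ã¡.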
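
The garbled path is consistent with the UTF-8 bytes of á (0xC3 0xA1) being decoded one byte at a time by a single-byte code page. The following is a minimal sketch of that effect, not a reproduction of what the launcher does internally. The class name `MojibakeDemo` and the method `misdecode` are made up for illustration, and ISO-8859-1 stands in for the Windows code page (an assumption: windows-1252 maps these two bytes to the same characters).

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encode the text as UTF-8, then (wrongly) decode each byte as one character.
    // ISO-8859-1 is used because every JVM is required to provide it.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The folder name is spelled with an escape (U+00E1 is a-acute)
        // so that the source file stays pure ASCII.
        String garbled = misdecode("Ot\u00e1vio");
        // The expected garbling is A-tilde (U+00C3) followed by an
        // inverted exclamation mark (U+00A1) in place of the a-acute.
        System.out.println(garbled.equals("Ot\u00c3\u00a1vio"));
    }
}
```

The comparison uses escapes rather than printing the garbled string, so the result does not depend on the console's own encoding.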
If the JDK is installed globally using scoop install -g, choco install, or the default installer from any JDK distribution, the commands work fine, but if I call javac and pass the full path to a file, it gives an error:

C:\Users\Otávio Augusto Silva>javac "C:\Users\Otávio Augusto Silva\Documents\Code\Java\Hello World\main.java"
error: file not found: C:\Users\OtÃ¡vio Augusto Silva\Documents\Code\Java\Hello World\main.java
Usage: javac <options> <source files>
use --help for a list of possible options

To reproduce, do the following:

  • Have a user folder whose name contains a non-ASCII Latin character (something like á, é, ã, etc.)
  • Have the Beta: Use Unicode UTF-8 for worldwide language support option enabled in the region settings
  • Install your favorite JDK distribution
  • Call javac passing the whole path, like C:\Users\USERFOLDER\PATH\TO\FILE\file.java

The error should appear.
I've been stuck on this for days; if anyone can help me, it will be greatly appreciated.
Some relevant info:

  • I'm using cmd in Windows Terminal app, but PowerShell gives the same error
  • The chcp command gives the code 65001
  • I already tried the solution presented here; it didn't work
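
When comparing environments like the one described above, it can help to see which encodings the JVM itself reports, in addition to the console code page from chcp. The sketch below only reads system properties. The class name `EncodingReport` is made up for illustration. `native.encoding` is only defined on Java 17 and later, and `sun.jnu.encoding` is an internal, unsupported property, so either may print as null.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

class EncodingReport {
    // Collect the encoding-related settings the running JVM reports.
    // Properties that are not defined on this JVM are stored as null.
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("java.version", System.getProperty("java.version"));
        report.put("defaultCharset", Charset.defaultCharset().name());
        report.put("file.encoding", System.getProperty("file.encoding"));
        report.put("native.encoding", System.getProperty("native.encoding"));
        report.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        return report;
    }

    public static void main(String[] args) {
        // Print one "name = value" line per setting, in insertion order.
        collect().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Running this with the beta option on and off, or under two JDK versions, gives a like-for-like comparison. It can only run where the java launcher itself starts, so it does not apply to the case where the JDK lives inside the affected user folder.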

Solution

  • Using your directory name (Otávio Augusto Silva), I can reproduce your problem on Windows 10 as well, using Java 18. Unfortunately, this looks like a specific example of a more general and longstanding problem documented in this open and unresolved JDK bug:

    JDK-4488646 Java executable and System properties need to support Unicode on Windows

    This is part of the bug report's description, with my emphasis added:

    To make Java completely Unicode-aware on NT we need to

    1. Modify System properties initialization code and all other places where Windows calls are used to use wide-char calls on NT.

    2. Modify java, javac etc. to be able to use Unicode in classpath and other command line arguments.

    That bug report was created in 2001! It relates to Windows NT, but since it remains open and unresolved I assume it has general applicability for all flavors of Windows, including Windows 10 and 11.

    Notes:

    • Although it doesn't help to resolve your specific problem, it is fairly straightforward "to use wide-char calls" within your Java application (as mentioned in the bug description above) using JNA. For example, your code could successfully process Otávio Augusto Silva if it were passed as an argument to your Java application. See this SO answer for the code to do that.

    • Also see open and unresolved JDK bug report JDK-8124977 cmdline encoding challenges on Windows which was raised in 2015. It includes some discussion on the differences between using java from cmd and PowerShell windows on Windows.

    ========================================================

    (This update is based on comments from @user16320675.)

    It seems the issue is fully resolved in Java 19 (download from here), which is due to be released later this month. From the screenshot below:

    • The call to javac will succeed when using JDK 19.

    • The same call to javac will fail when using JDK 18, because the file name D:\Otávio... is processed as D:\OtÃ¡vio....

      [Screenshot: javac calls]

    I can't find any mention of this fix in the JDK 19 Release Notes.

    ========================================================

    (This update shows what happens if the beta option is not enabled.)

    If the option Beta: Use Unicode UTF-8 for worldwide language support is not enabled, I cannot reproduce the problem; the call to javac works fine using both JDK 18 and JDK 19:

    [Screenshot: Beta option not enabled]

    Note that this works even though the code page in the cmd window is 437, not 65001. Of course there are a couple of significant differences between your environment and mine:

    • You are using Windows 11 and I am using Windows 10.
    • My system locale is English (United States), and I assume that yours is different.

    To summarize how to resolve this issue:

    • Unless you have that beta option enabled for some specific reason, consider just disabling it.
    • If you want to keep the option enabled, consider upgrading to Java 19.
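
If a tool or script needs to warn about this situation, the second suggestion can be turned into a simple runtime check. This is only a sketch: the class name `LauncherFixCheck` and the method `likelyHasFix` are made up for illustration, and the threshold of 19 comes from the observation in this answer, not from the release notes. Update releases of older lines may also behave correctly; that is not checked here.

```java
class LauncherFixCheck {
    // First feature release observed to work in this answer (an assumption,
    // based on testing JDK 18 against JDK 19 with the beta option enabled).
    static final int FIRST_WORKING_FEATURE_RELEASE = 19;

    // True when the given feature version is at or above the observed threshold.
    static boolean likelyHasFix(int featureVersion) {
        return featureVersion >= FIRST_WORKING_FEATURE_RELEASE;
    }

    public static void main(String[] args) {
        // Runtime.version().feature() is available from Java 10 onward.
        int feature = Runtime.version().feature();
        if (likelyHasFix(feature)) {
            System.out.println("Java " + feature + ": non-ASCII paths were observed to work");
        } else {
            System.out.println("Java " + feature + ": consider upgrading or disabling the beta option");
        }
    }
}
```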

    ========================================================

    Update: The following bug was fixed in Java 19:

    8272352: Java launcher can not parse Chinese character when system locale is set to UTF-8 #530

    Although that bug fix specifically relates to file names passed to java, I think it probably explains why the OP's problem with javac is also resolved in Java 19.
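
To check whether a given JDK delivers command-line arguments intact, a small program can print the code points of each argument it receives. This is a diagnostic sketch, with the made-up class name `ArgCodePoints`. It prints only ASCII, so the result does not depend on how the console renders non-ASCII text.

```java
import java.util.stream.Collectors;

class ArgCodePoints {
    // Render a string as space-separated code points, e.g. "U+0041 U+0042".
    static String describe(String arg) {
        return arg.codePoints()
                  .mapToObj(cp -> String.format("U+%04X", cp))
                  .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // One output line per command-line argument.
        for (String arg : args) {
            System.out.println(describe(arg));
        }
    }
}
```

Passing Otávio as an argument should yield U+00E1 for the á if the launcher decoded it correctly. Seeing U+00C3 U+00A1 in its place would indicate the misdecoding described above.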