Tags: java, ruby, utf-8, utf8-decode

How to decode a path and open related file?


I'm trying to parse an iTunes XML library file on Mac OS.

In the iTunes file I have this string:

<key>Location</key>
<string>file://localhost/home/user/Downloads/album/01%C4%2024%5F7%20%28Intro%29%20&#38;.mp3</string>

As you can see, it appears to be URL-encoded. The next step is to find and read this file on the filesystem, but I cannot find a proper way to do that in Java or Ruby.

My main goal is to do that in Java, but for the sake of simplicity I tried this in Ruby:

2.0.0p247 :003 > URI.decode(str)
 => "file://localhost/home/user/Downloads/album/01\xC4 24_7 (Intro) &#38;.mp3" 
2.0.0p247 :004 > CGI.unescape(str)
 => "file://localhost/home/user/Downloads/album/01\xC4 24_7 (Intro) &#38;.mp3" 

The file appears in the file manager as /home/user/Downloads/album/01Ä 24_7 (Intro) &.mp3

Java:

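  // Attempt 1: decode the percent-escapes as ISO-8859-1
  String path = URLDecoder.decode(originalPath, "ISO-8859-1");
  // Attempt 2: decode the percent-escapes as UTF-8
  path = URLDecoder.decode(originalPath, "UTF-8");
  // Attempt 3: let java.net.URI parse the string and return the decoded path
  path = new java.net.URI(originalPath).getPath();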

But I cannot find the file using the resulting path in code: exists() and isFile() always return false, although the path looks correct in the debugger.
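
For comparison, here is a minimal Java sketch of what the two URLDecoder attempts yield for the Location string above. The class name LocationDecoder and the helper toPath are made up for illustration. The sketch assumes the value was read as raw text, so the XML escape &#38; is still present (an XML parser would normally have turned it into & already). It only builds the path string and does not touch the filesystem.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class LocationDecoder {
    private static final String PREFIX = "file://localhost";

    /** Hypothetical helper: turns an iTunes Location value into a plain path string. */
    static String toPath(String location, String charsetName) {
        // Undo the XML escaping of '&' (assumption: the value was not read via an XML parser).
        String s = location.replace("&#38;", "&");
        // Drop the scheme and host so that only the path remains.
        if (s.startsWith(PREFIX)) {
            s = s.substring(PREFIX.length());
        }
        // URLDecoder follows form-encoding rules and would turn '+' into a space,
        // so protect literal plus signs before decoding.
        s = s.replace("+", "%2B");
        try {
            return URLDecoder.decode(s, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String location = "file://localhost/home/user/Downloads/album/"
                + "01%C4%2024%5F7%20%28Intro%29%20&#38;.mp3";
        // %C4 is the ISO-8859-1 byte for 'Ä', so this decoding yields the expected name.
        System.out.println(toPath(location, "ISO-8859-1"));
        // A lone 0xC4 byte is not valid UTF-8, so it becomes U+FFFD here.
        System.out.println(toPath(location, "UTF-8"));
    }
}
```

With ISO-8859-1 the decoded string matches the name shown in the file manager; with UTF-8 the Ä is lost. Even a correctly decoded string can still fail exists() if the JVM cannot encode the non-ASCII character back into file-name bytes, which depends on the locale the JVM was started with.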

How should I proceed?

Sadly, I'm not fetching the file name from the OS, but from an XML file. Am I hitting a long-standing JVM bug?

Setting the JVM to a specific locale is not acceptable, because the files I need to open are not limited to names in the user's locale.


Solution

  • Solved by setting the LANG environment variable before starting the JVM.
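
To see which settings a given JVM actually picked up, a small diagnostic along these lines may help. It is only a sketch: the class name FileNameEncodingCheck and the helper describe are invented here, and sun.jnu.encoding is an internal property of OpenJDK/Oracle JVMs that may be absent or behave differently elsewhere. The assumption is that on Unix-like systems the JVM derives its file-name encoding from the locale environment (LANG and related variables) at startup.

```java
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

class FileNameEncodingCheck {
    /** Hypothetical helper: summarises settings that may influence file-name encoding. */
    static String describe() {
        return "LANG=" + System.getenv("LANG")
                + ", sun.jnu.encoding=" + System.getProperty("sun.jnu.encoding")
                + ", file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println(describe());
        if (args.length > 0) {
            try {
                // Optional: check a path passed on the command line.
                Path p = Paths.get(args[0]);
                System.out.println("exists: " + Files.exists(p));
            } catch (InvalidPathException e) {
                // May happen when the name cannot be encoded in the JVM's file-name encoding.
                System.out.println("cannot represent path: " + e.getMessage());
            }
        }
    }
}
```

Running it once with LANG unset and once with LANG set to the locale in use should show whether the reported encodings change between the two runs.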