Tags: file, utf-8, lua, invalid-argument

Opening Japanese-named files in Lua


I have a bunch of XML files with Japanese names. I use Lua to read them and put the necessary information into tables. I can open files whose names are a single kanji, such as 名.xml, but files whose names contain several kanji, such as 名前.xml, fail to open. Before running the Lua script, I set the console code page to 65001 (UTF-8). To read the files I have to convert each filename with the winapi library between UTF-8 and ACP (the ANSI code page), but this conversion only works for the single-kanji names. I've tried several suggestions from around the internet, such as using the short path to the file, but none of them worked. I also tried the short path while running Lua as administrator, since a similar question says administrator privileges are needed to use short paths, but no luck.

...
for fn in io.popen("DIR xml /B /AA"):lines() do
    ...
    local f = assert(io.open("xml\\" .. winapi.encode(winapi.CP_UTF8, winapi.CP_ACP, fn), "rb"))
    ...
end
...

But my code produces an "Invalid argument" error. I searched for this error, but none of the results were Lua-related, so I looked at the C/C++ ones, and all I found was advice like "use _wfopen". That isn't available in Lua, and I don't want to implement it myself. Does anyone have an idea how to solve this? If you need more information, just let me know. Thanks!


Solution

  • I don't know why your program does not work, but try this workaround:

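    -- Let cmd.exe open the files: print each file's content followed by a
    -- marker line that carries its name, all in one stream.
    local pipe = io.popen([[for %G in (xml\*) do @(type "%G" & echo @FILENAMEMARKER#%G)]], "rb")
    local all_files = pipe:read"*a"
    pipe:close()
    -- Each match is one file: the text before the marker is the content,
    -- the rest of the marker line is the file name.
    for filecontent, filename in all_files:gmatch"(.-)@FILENAMEMARKER#(.-)\r?\n" do
        -- process your file here
        print('===== This is your file name:')
        print(filename)
        print('== This is your file content:')
        print(filecontent)
        print('== End of file')
    end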
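
The splitting step of the workaround above can be tried without any Japanese-named files, or even without Windows. The sketch below runs the same pattern over a hand-written string that imitates the command's output. The file names, the contents, and the `split_files` helper are made up for illustration.

```lua
-- Marker text used by the workaround's `echo` command.
local MARKER = "@FILENAMEMARKER#"

-- Hypothetical sample of what the pipe might return for two files.
local sample = table.concat({
  "<a>one</a>\r\n",                 -- content of the first file
  MARKER .. "xml\\first.xml\r\n",   -- marker line naming the first file
  "<b>two</b>",                     -- second file, no trailing newline
  MARKER .. "xml\\second.xml\r\n",  -- marker line naming the second file
})

-- Split the combined stream into { name = ..., content = ... } records.
local function split_files(all_files)
  local files = {}
  for content, name in all_files:gmatch("(.-)" .. MARKER .. "(.-)\r?\n") do
    files[#files + 1] = { name = name, content = content }
  end
  return files
end

local files = split_files(sample)
for _, f in ipairs(files) do
  print(f.name, #f.content)
end
```

The split relies on the marker text never appearing inside a file. If it might, pick a different marker string.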