python windows unicode directory

Reading Japanese filenames on Windows with Python and glob is not working


I just set up PortablePython on my system so that I can run Python scripts from PHP. I have some very basic code (below) that lists all the files in a directory, but it doesn't work with Japanese filenames. It works fine with English filenames, but it raises errors (below) as soon as the directory contains a file whose name has Japanese characters.

import os, glob

path = r'G:\path'
for infile in glob.glob(os.path.join(path, '*')):
    print("current file is: ", infile)

It works fine in 'PyScripter-Portable.exe'. However, when I run 'PortablePython\App\python.exe "test.py"' from the command prompt or from PHP, it produces the following error:

current file is:  Traceback (most recent call last):
  File "test.py", line 5, in <module>
    print("current file is: ", infile)
  File "PortablePython\App\lib\io.py", line 1494, in write
    b = encoder.encode(s)
  File "PortablePython\App\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 37-40: character maps to <undefined>



I'm very new to Python and am only using it to get around a PHP issue with reading Unicode filenames on Windows, so I really need this to work. Any help you can give me would be great.


Solution

  • Assuming you're using Python 2.x, try changing your strings to Unicode strings, like this:

    path = u'G:\\path'
    for infile in glob.glob(os.path.join(path, u'*')):
        print(u"current file is: ", infile)
    

    That should let Python's filesystem-related functions know that you want to work with Unicode file names: in Python 2, functions such as glob.glob return Unicode names when they are given a Unicode path.
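
The traceback in the question fails inside the cp437 encoder while print is writing to the console, which suggests the file names are being read correctly and only the output step breaks. The following is a minimal sketch of one way around that, assuming Python 3; console_safe is a hypothetical helper name, not part of the original code. It replaces any character the output encoding cannot represent with a backslash escape, so print no longer raises.

```python
import glob
import os
import sys


def console_safe(text, encoding=None):
    """Return text with unencodable characters replaced by backslash escapes.

    console_safe is a hypothetical helper introduced for this sketch.
    """
    # sys.stdout.encoding may be None when output is redirected; fall back to ASCII.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    # Encode with backslashreplace so unsupported characters become \uXXXX
    # escapes, then decode again to get a str that print can always write.
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


if __name__ == "__main__":
    path = r"G:\path"  # directory from the question
    for infile in glob.glob(os.path.join(path, "*")):
        print("current file is:", console_safe(infile))
```

The escaped names are only for display; infile itself still holds the real name and can be passed to open() or os.rename() unchanged. If PHP needs the actual characters rather than escapes, another option that may work is setting the PYTHONIOENCODING environment variable to utf-8 before launching the script, so that print writes UTF-8 instead of cp437.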