Tags: python, special-characters, google-colaboratory, listdir

os.listdir returns strange strings for filenames with special characters


Suppose I have the following files in path, a folder in my Google Drive that is mounted in a Python 3 Colab notebook:

(Here, the lines starting with # show the output.)

import os

ls = os.listdir(path)
print(ls)
# ['á.csv', 'b.csv']

Everything seems OK, but if I write

'á.csv' in ls
# False

I get False, although it should return True. However, if I repeat the last snippet but copy-paste 'á.csv' from the output of print(ls) instead of typing it, it returns True.

Thanks

PS: The problem is not specific to that filename; it happens with several filenames that contain special characters (namely í, á, é, ó, ñ).


Solution

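  • The names returned by os.listdir are probably stored in a different Unicode normalization form (for example decomposed, NFD) than the string you type (usually composed, NFC). They look identical when printed but do not compare equal. You can normalize the file list before comparing: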

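    import os
    from unicodedata import normalize

    # Normalize every name returned by os.listdir to NFC
    ls = [normalize('NFC', f) for f in os.listdir(path)]
    # Compare against a name normalized the same way
    normalize('NFC', 'á.csv') in ls
    # A plain 'á.csv' in ls also works if the typed literal is already in NFC form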
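
To see why two names that print the same can compare unequal, here is a minimal sketch that builds both forms by hand. It assumes the Drive filenames use the decomposed form, which the question does not confirm. The names composed, decomposed and same_name are made up for illustration.

```python
import unicodedata

# Two ways to write the same visible name "á.csv"
composed = "\u00e1.csv"      # 'á' as a single code point (NFC form)
decomposed = "a\u0301.csv"   # 'a' followed by a combining acute accent (NFD form)

# Both render identically, yet they are different strings
print(composed == decomposed)          # False
print(len(composed), len(decomposed))  # 5 6

def same_name(a, b):
    """Compare two filenames after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(same_name(composed, decomposed))  # True
```

If you want to check which form your real filenames use, printing [ascii(f) for f in os.listdir(path)] shows the escaped code points instead of the rendered characters.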