
Convert listdir() return value to bytes for decoding


I have a list of directories whose names are encoded in 'gbk', for example:

dirs
  |- b'\xb6\xb0'/
  |- b'\xc1\xb1'/
  |- b'\xc9\xdd'/

However, when I use os.listdir(), it returns a list of str, like this:

["b'\\xb6\\xb0'", "b'\\xc1\\xb1'", "b'\\xc9\\xdd'"]

How can I convert each string to bytes and decode it to get the original characters? I tried str.encode() followed by decode(), but that did not work.

Thanks.


Solution

  • You've used the string representation of bytes objects as the names of your directories, instead of creating the directories with the encoded byte strings as their names. To undo the blunder, you could in this particular case use ast.literal_eval() to evaluate each string representation and then decode the resulting bytes object:

    import os
    import ast
    
    dirs = [ast.literal_eval(d).decode('gbk') for d in os.listdir(...)]
    

    Please note that ast.literal_eval() is used here only to recover the directory names; you should then recreate the directories properly. In other words, they should not have been created this way to begin with.
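
A slightly more defensive version of the same idea might look like the sketch below. The helper names recover_name() and rename_mangled_dirs() are hypothetical, and the sketch assumes that only some entries are mangled and that the rest should be left untouched. A name that is a bytes literal but is not valid 'gbk' would still raise UnicodeDecodeError.

```python
import ast
import os


def recover_name(name, encoding="gbk"):
    """Return the decoded name if `name` looks like repr(bytes), else `name` unchanged."""
    try:
        value = ast.literal_eval(name)
    except (ValueError, SyntaxError, TypeError):
        return name  # not a Python literal, so leave it alone
    if not isinstance(value, bytes):
        return name  # a literal, but not a bytes literal (e.g. "123")
    return value.decode(encoding)


def rename_mangled_dirs(root, encoding="gbk"):
    """Rename every mangled entry directly under `root` to its decoded name."""
    for name in os.listdir(root):
        fixed = recover_name(name, encoding)
        if fixed != name:
            os.rename(os.path.join(root, name), os.path.join(root, fixed))
```

Calling rename_mangled_dirs() on the parent directory would rename the entries in place, so it may be worth trying on a copy first. As a side note on the title question: when os.listdir() is given a bytes path, it returns the entry names as bytes rather than str.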