python string unicode utf-16

Reading text file into variable, subsequent print() returns escape characters?


I've been searching around on this one to no avail. I have a snippet that reads a text file into a variable in Python so that I can refer to it later (specifically, to kill a running process).

File is generated like this:

os.system('wmic process where ^(CommandLine like "pythonw%pycpoint%")get ProcessID > windowsPID.txt')

Resulting text file windowsPID.txt looks like this:

ProcessId
4076

My python snippet to read the file looks like this:

with open('windowsPID.txt') as f:
    print "In BuildLaunch, my PID is: "
    b = f.readlines()
    print b

print b outputs the following:

['\xff\xfeP\x00r\x00o\x00c\x00e\x00s\x00s\x00I\x00d\x00 \x00 \x00\r\x00\n', '\x004\x000\x007\x006\x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00\r\x00\n', '\x00']

I can see the 4076, but why can't I get it to output properly? I just need that second line.

UPDATE

As mentioned by roippi, this can be fixed by opening the file with the UTF-16 encoding:

import codecs
with codecs.open('windowsPID.txt', encoding='utf-16') as f:
    b = f.readlines()

All fixed!

-Chow


Solution

  • In Python 2, open() does no decoding at all: it hands back the file's raw bytes as str objects. The file wmic wrote is UTF-16 encoded, and printing a list shows the repr of each string, so you see the escaped bytes on your screen.

    \xff\xfe is the UTF-16 (LE) byte order mark. You need to open your file with the proper encoding.

    import codecs

    # 'utf-16' reads the BOM to pick the byte order, then strips it.
    with codecs.open('windowsPID.txt', encoding='utf-16') as f:
        b = f.readlines()
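
Since the question only needs the PID itself, here is a minimal sketch of pulling it out once the file is decoded. It uses io.open instead of codecs.open (an assumption on my part, chosen because it behaves the same on Python 2.6+ and Python 3), and read_pid is a hypothetical helper name. The sample file is written by the script itself, using the contents shown in the question, so the snippet runs without wmic.

```python
import io
import os
import tempfile


def read_pid(path):
    """Return the first numeric row of a UTF-16 file shaped like wmic's output."""
    # 'utf-16' uses the BOM to pick the byte order and drops it from the text.
    with io.open(path, encoding='utf-16') as f:
        lines = [line.strip() for line in f]
    # Skip the "ProcessId" header and any blank rows; keep only digit-only rows.
    pids = [int(line) for line in lines if line.isdigit()]
    if not pids:
        raise ValueError('no PID found in %s' % path)
    return pids[0]


# Build a stand-in for windowsPID.txt (hypothetical sample data from the question).
sample = u'ProcessId  \r\n4076       \r\n'
sample_path = os.path.join(tempfile.mkdtemp(), 'windowsPID.txt')
# newline='' stops Python from rewriting the line endings already in the sample.
with io.open(sample_path, 'w', encoding='utf-16', newline='') as f:
    f.write(sample)

pid = read_pid(sample_path)
print(pid)
```

Filtering on isdigit() rather than taking "the second line" by index is a design choice here: it tolerates the trailing padding and blank rows that wmic tends to leave in the file.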