I open my file thus:
with open(sourceFileName, 'r', encoding='ISO-8859-1') as sourceFile:
but when I call
previousLine = linecache.getline(sourceFileName, i - 1)
I get an exception:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb4 in position 169: invalid start byte
This is because (I think) linecache.getline returns a str() (which does not have a decode() method).
My script must be able to support Unicode, so I can't simply convert the input file to UTF-8.
linecache takes a filename, not a file object, as your usage shows. It has no provision for an encoding. Also, from the documentation:
This is used by the traceback module to retrieve source lines for inclusion in the formatted traceback.
This implies that it is mainly used for Python source code. As it turns out, if the file has a Python source file encoding comment, it works:
# coding: iso-8859-1
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ
[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»
¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ
import linecache
print(linecache.getline('input.txt', 3))
[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»
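To see why the coding comment makes a difference, here is a minimal sketch of the encoding detection that Python applies to source files. As far as I can tell, CPython's linecache reads files through the tokenize module, which uses this detection, but treat that as an assumption about the implementation rather than documented behavior. The helper name detected_source_encoding and the sample byte strings are made up for illustration.

```python
import io
import tokenize


def detected_source_encoding(raw_bytes):
    """Return the encoding Python's source-file rules would pick for raw_bytes."""
    # detect_encoding() looks for a BOM or a coding comment in the first
    # two lines and falls back to UTF-8 when it finds neither.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(raw_bytes).readline)
    return encoding


# Hypothetical file contents: byte 0xb4 is the acute accent in ISO-8859-1.
with_cookie = b"# coding: iso-8859-1\nabc \xb4\n"
without_cookie = b"abc\ndef\nabc \xb4\n"

print(detected_source_encoding(with_cookie))
print(detected_source_encoding(without_cookie))
```

Without the comment the file is assumed to be UTF-8, so a lone 0xb4 byte cannot be decoded, which matches the error in the question.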
So linecache probably isn't the solution to your issue. Instead, open the file as you've shown and perhaps cache the lines yourself:
with open('x.txt', encoding='iso-8859-1') as f:
    lines = f.readlines()
# The list is 0-indexed, so lines[2] is the third line of the file.
print(lines[2])
You could also append lines to a list as they are read if you don't want to read the whole file, similar to linecache.
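The incremental approach might look something like the sketch below. The class name LazyLineCache and its methods are hypothetical, not part of any library. It reads only as far into the file as the requested line and remembers what it has read. Returning an empty string for an out-of-range line number is a choice made here to mimic linecache.getline.

```python
class LazyLineCache:
    """Read lines on demand from a text file opened with an explicit encoding."""

    def __init__(self, path, encoding):
        self._file = open(path, 'r', encoding=encoding)
        self._lines = []  # lines read so far, in file order

    def getline(self, lineno):
        """Return line number lineno (1-based), or '' if it does not exist."""
        if lineno < 1:
            return ''
        # Read further into the file only if the line is not cached yet.
        while len(self._lines) < lineno:
            line = self._file.readline()
            if not line:  # reached end of file
                return ''
            self._lines.append(line)
        return self._lines[lineno - 1]

    def close(self):
        self._file.close()
```

With this, the call from the question would become something like cache = LazyLineCache(sourceFileName, 'ISO-8859-1') followed by previousLine = cache.getline(i - 1). Remember to call cache.close() when done.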