I am using the following line to read some of the rows of a txt file, skipping the header and footer:
np_data = np.loadtxt(file, delimiter="\t", skiprows=12, max_rows=1024)
The problem is that the footer contains this character: ∞, which causes the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 4729: invalid start byte
Is there a way to skip that character or line? For me the combination of skiprows and max_rows does not seem to work. Thank you
Is there a way to skip that (...)line?
According to the numpy.loadtxt documentation, the first argument might be:
File, filename, list, or generator to read. If the filename extension is .gz or .bz2, the file is first decompressed. Note that generators must return bytes or strings. The strings in a list or produced by a generator are treated as lines.
Thus you might wrap the file handle to skip the lines you do not want. Consider the following simple example. Let the content of file.csv be
1,2,3
4,∞,6
7,8,9
then
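import numpy as np
# Open in binary mode so that no line is decoded before it has been filtered.
with open("file.csv", "rb") as f:
    # Drop every line that contains the UTF-8 bytes of ∞ (E2 88 9E).
    arr = np.loadtxt(filter(lambda x: b"\xe2\x88\x9e" not in x, f), delimiter=",")
print(arr)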
gives output
[[1. 2. 3.]
 [7. 8. 9.]]
Explanation: I open file.csv in binary mode, then use filter to select the lines from the file handle f that do not contain the byte sequence \xe2\x88\x9e (the UTF-8 encoding of ∞).
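Note that the traceback in the question reports byte 0xb0, which is not part of the UTF-8 encoding of ∞ (E2 88 9E), so the real file may be stored in a single-byte encoding instead (in Latin-1, 0xb0 is the degree sign °). In that case, filtering on the ∞ bytes would not remove the offending line. A minimal sketch of another way to use the same "pass an iterable of lines" idea: open the file with an encoding that can decode every byte and hand loadtxt only the wanted slice of lines. The helper name load_rows, the latin-1 default, and the demo file contents are assumptions made for illustration, not something taken from the question.

```python
import itertools
import os
import tempfile

import numpy as np


def load_rows(path, skip, count, delimiter="\t", encoding="latin-1"):
    """Hypothetical helper: parse `count` lines after skipping `skip` lines.

    latin-1 maps every possible byte to a character, so decoding cannot
    fail; the footer is never handed to np.loadtxt at all.
    """
    with open(path, encoding=encoding) as f:
        # islice yields only lines skip .. skip + count - 1 of the file.
        wanted = itertools.islice(f, skip, skip + count)
        return np.loadtxt(wanted, delimiter=delimiter)


# Build a small demo file (assumed layout): 2 header lines, 2 data rows,
# and a footer that contains the raw byte 0xb0.
header = b"header line\n" * 2
body = b"1\t2\t3\n4\t5\t6\n"
footer = b"temperature: 25 \xb0C\n"
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(header + body + footer)

arr = load_rows(tmp.name, skip=2, count=2)
os.remove(tmp.name)
print(arr)
```

For the file in the question, the call would presumably be load_rows(file, skip=12, count=1024). If the data rows are plain ASCII, passing encoding="latin-1" directly to np.loadtxt together with skiprows and max_rows may also be enough, since loadtxt accepts an encoding argument.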