python python-2.7 readlines

Python readlines() does not return the whole file


I have an auto-generated info file produced by a measurement. It consists of both binary and human-readable parts, and I want to extract some of the non-binary metadata. For some files I cannot reach the metadata because readlines() does not return the whole file. My guess is that the file contains an EOF character. I can open the file in Notepad++ without problems.

A possible solution would be to read the file in binary mode and convert it to characters afterwards, deleting the EOF character along the way. Still, I wonder whether there is a more elegant way to do this.

Edit: The question was rightfully downvoted; I should have provided code. I actually use

f = open(fname, 'r')
raw = f.readlines()

and then walk through the list. The EOF characters present in the file (handled differently depending on the OS) seem to cause the havoc I am observing. I will accept the answer that suggests the binary 'rb' flag. By the way, this was an impressive response time! (-:


Solution

  • with open(afile, "rb") as f:
        print(f.readlines())
    

    What's the problem with doing this?

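    If you don't open the file in binary mode, the platform's text-mode handling is applied to the binary parts as well. On Windows, Python 2 text mode treats the byte 0x1A (Ctrl-Z) as end-of-file and converts \r\n to \n, so reading stops at the first such byte, and everything after it is lost, including any ASCII metadata mixed in with the binary data.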
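
    To get from the raw lines to the readable metadata, something like the following might work. It is only a sketch under two assumptions that the question does not confirm: the metadata lines are pure ASCII, and every binary line contains at least one non-ASCII byte. The helper names read_all_lines and ascii_lines are made up for illustration.

```python
def read_all_lines(path):
    """Read every line as bytes. Binary mode disables EOF and newline translation."""
    with open(path, "rb") as f:
        return f.readlines()


def ascii_lines(raw_lines):
    """Keep only the lines that decode as ASCII, with line endings stripped.

    Assumption: binary lines contain at least one byte >= 0x80, so they
    fail to decode and are skipped.
    """
    result = []
    for line in raw_lines:
        try:
            text = line.decode("ascii")
        except UnicodeDecodeError:
            continue  # not pure ASCII, treat as binary payload
        result.append(text.rstrip("\r\n"))
    return result
```

    Calling ascii_lines(read_all_lines(fname)) would then give a list of text lines to walk through as before. If the binary parts can also consist solely of bytes below 0x80, this filter is too permissive and a stricter check (for example, on printable characters only) would be needed.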