I have a data file like the following.
Index Code Pos1 Strand Chr2 Pos2 length blocks
1 G32_bkd.ctx:Vu01(old4) 62739 47+9- Vu01(old4) 63651 790 0
2 G32_bkd.ctx:Vu01(old4) 441403 10+0- Vu01(old4) 446263 4893 0
3 G32_bkd.ctx:Vu01(old4) 450546 15+0- Vu01(old4) 451091 576 0
4 G32_bkd.ctx:Vu01(old4) 459741 10+0- Vu01(old4) 460841 1068 0
5 G32_bkd.ctx:Vu01(old4) 612262 14+0- Vu01(old4) 629013 16788 0
6 G32_bkd.ctx:Vu01(old4) 688380 23+0- Vu01(old4) 693207 4872 0
7 G32_bkd.ctx:Vu01(old4) 730643 12+0- Vu01(old4) 740497 7011 0
8 G32_bkd.ctx:Vu01(old4) 834116 16+1- Vu01(old4) 835797 1752 0
I want to read the header line separately and then read each remaining line in a for loop. My code is
    with open(file) as f:
        title_line = f.readline()
        for line in f:
            line = line.strip()
            cols = line.split()
When I put print(line) inside the for loop, it doesn't print anything. But when I put print(title_line), the entire file is printed, preserving the exact format in the file. What went wrong?
N.B. When I copied and pasted the whole file and saved it under a different name, it worked just fine.
One thing that could cause that behavior is Python not recognizing the end-of-line characters in the original file. In that case readline() would treat the whole file as a single line, leaving nothing for the for loop to iterate over.
To confirm that, on Linux you could run od -t a file | less and inspect what's in there. Perhaps the file follows a different operating system's line-ending convention? If you are not on Linux, you can use Python itself to print each character with ord and see which line ending is in use (\n, \r, or \r\n).
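As a rough sketch of that check in Python, the snippet below reads the file in binary mode and counts each line-ending style. The helper name count_line_endings is made up for illustration, and the regular-expression approach is just one way to do it.

```python
import re
from collections import Counter


def count_line_endings(path):
    """Count how often each line-ending style occurs in the file at `path`.

    Hypothetical helper, not part of any library.
    """
    # Binary mode, so Python performs no newline translation.
    with open(path, "rb") as f:
        data = f.read()
    # Try \r\n first so it is not counted as a separate \r and \n.
    endings = re.findall(rb"\r\n|\r|\n", data)
    return Counter(e.decode("ascii") for e in endings)
```

If the result contains only "\r", the file uses bare carriage returns, which would be consistent with readline() returning the whole file at once.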
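If that's the case, you have some options:
- Open the file in universal-newline mode: open(file, "U"). Note that the "U" flag is deprecated in Python 3 and was removed in Python 3.11.
- Use io.open instead of open, and use its newline= argument. The default, None, should be what you need.
If this does not fix your issue, please provide: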
As an unrelated side note, I'd suggest you check Python's built-in csv module for reading your file. It seems like a perfect fit (the csv module can be configured to use spaces or tabs instead of commas).
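A minimal sketch of the csv approach, assuming Python 3 and columns separated by one or more spaces, as in the sample data. The function name read_table is hypothetical. Passing newline="" to open leaves line endings untranslated so the csv module can handle \n, \r, and \r\n itself.

```python
import csv


def read_table(path):
    """Return (header, rows) from a space-delimited text file.

    Hypothetical helper; assumes Python 3 and no quoted fields.
    """
    # newline="" leaves line endings untranslated; the csv reader
    # accepts \n, \r and \r\n as end-of-line on its own.
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter=" ", skipinitialspace=True)
        header = next(reader)                  # first line: column names
        rows = [row for row in reader if row]  # skip blank lines
    return header, rows
```

Each row comes back as a list of strings, so a numeric column such as Pos1 still needs int(row[2]) or similar.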