Python identifying tab as a string when reading a data file


I have been trying to debug this for a long time now and can't seem to find a solution.

Essentially, I opened a .txt file in Excel as a tab-delimited file on macOS, then copied and pasted the columns I wanted into a new file. When I use readline() to read the first line of the file in my Python script, the "\t" is treated by Python as part of the string.

For example:

line = column1 column2
10000.00 1000.00

This is the section of my script where the "error" happens:

 13 class read_file:
 14         def __init__(self,filePath):
 15                 self.filePath = filePath
 16                 self.infile = open(self.filePath,'r')
 17                 self.var_names = []
 18                 self.data = []
 19         def get_var_names(self):
 20                 var_names_str = (self.infile).readline().rstrip()
 21                 var_names_list = var_names_str.split(" ")
 22                 for name in var_names_list:
 23                         if name !="line" and name != "=":
 24                                 (self.var_names).append(name)
 25                 print("Headers to plot: {}".format(self.var_names)) 
 26                 return self.var_names

Output:

Headers to plot: ['column1\tcolumn2']

I created the file that contains column1 and column2 manually because I wanted to quickly plot the results and see if the graphs made sense before using pandas. There's some other post-processing work I have to do as well, so I thought copying and pasting shouldn't be an issue. But apparently it is.

If someone has a suggestion, or can explain why this is happening, I would really appreciate the input! In the meantime I am still debugging my code.

Thanks!


Solution

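  • split(" ") on line 21 splits only on the space character, but the columns in the file are separated by a tab, so the two names stay together in one string. The "\t" in the output is just how Python displays a tab character inside a string. Line 21 should be:

    var_names_list = var_names_str.split("\t")

    to split at the tabs instead of including them.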
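
    To see the difference, here is a minimal sketch. It assumes the header line is just the two names separated by a single tab; the sample text and the read_tab_file helper are made up for illustration and are not part of the original script. It also shows split() with no argument, which splits on any run of whitespace, and the standard csv module as an alternative for tab-delimited files.

```python
import csv
import io

# Assumed header line: two column names separated by one tab character.
header = "column1\tcolumn2"

print(header.split(" "))   # no space in the string, so nothing is split
print(header.split("\t"))  # split at the tab
print(header.split())      # no argument: split on any run of whitespace


def read_tab_file(handle):
    """Hypothetical helper: read a header row, then numeric rows, from a tab-delimited file."""
    reader = csv.reader(handle, delimiter="\t")
    var_names = next(reader)  # first row holds the column names
    # Skip empty rows and convert every remaining value to float.
    rows = [[float(value) for value in row] for row in reader if row]
    return var_names, rows


# io.StringIO stands in for the real file so the sketch runs on its own.
sample = io.StringIO("column1\tcolumn2\n10000.00\t1000.00\n")
names, data = read_tab_file(sample)
print(names)
print(data)
```

    If the header mixes spaces and tabs, split() with no argument is probably the more forgiving choice, since it treats both as separators.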