I have a large text file, with a header of 18 lines.
If I try to display the entire dataframe:
df = pd.read_csv('my_log')
print(df)
I get:
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 19, saw 3
If I try to exclude the header:
df = pd.read_csv('my_log', header=18)
I get the first row (line 19) with no index number, then the second row (shown with index 0).
No matter which index number I use in print(df.loc[[0]]), I always get that first row displayed (with no index number) before the row that I want.
I've checked the text file, and every row ends in CR/LF. I've also completely removed line 19, but the same behavior occurs.
Also, if I completely remove the header and print the entire dataframe, I still get the same behavior. The first row prints (without an index number) and the row count is 1 less than the true row count.
Any suggestions greatly appreciated!
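header=18 tells pandas to use line 19 (row 18, counting from 0) as the column names, which is why that row prints at the top with no index number and the row count is one short. The same thing happens when the header is removed from the file, because the default header=0 uses the first line as the column names. One approach is to skip the header rows and define the column names yourself: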
import pandas as pd
column_names = ['Column1', 'Column2', 'Column3']  # placeholder names; replace with the real ones
df = pd.read_csv('my_log', skiprows=18, names=column_names)  # skip the 18 header lines
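To see the difference between the two calls without touching the real log, here is a minimal sketch. It builds a small in-memory stand-in for the file; the preamble text, the three data rows, and the comma delimiter are all assumptions made up for illustration, not the contents of my_log.

```python
import io
import pandas as pd

# Hypothetical stand-in for 'my_log': 18 single-field preamble lines,
# followed by 3 comma-separated data rows (assumed layout).
preamble = "".join(f"preamble line {i}\n" for i in range(1, 19))
data = "a,1,x\nb,2,y\nc,3,z\n"
text = preamble + data

# header=18 uses line 19 (0-based row 18) as the column names,
# so that line is no longer counted as a data row.
df_header = pd.read_csv(io.StringIO(text), header=18)

# skiprows=18 plus explicit names keeps every data row.
df_names = pd.read_csv(
    io.StringIO(text),
    skiprows=18,
    names=["Column1", "Column2", "Column3"],
)

print(df_header)
print(df_names)
```

In the first frame the values from line 19 appear as the column labels, which matches the "row with no index number" described in the question. If the real file does have its own column names on line 19, header=18 is already doing the right thing and that first printed line is just the header.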