Pandas displaying the first row but not indexing it


I have a large text file, with a header of 18 lines.

If I try to display the entire dataframe:

import pandas as pd

df = pd.read_csv('my_log')
print(df)

I get: pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 19, saw 3

If I try to exclude the header:

df = pd.read_csv('my_log', header=18)

I get the first row (line 19), then the second row (shown with index 0). No matter which index number I use in print(df.loc[[0]]), that first row is always displayed (with no index number) before the row that I want.

I've checked the text file, and every row ends in CR/LF. I've also removed line 19 completely, but the same behavior occurs.

Also, if I completely remove the header and print the entire dataframe, I still get the same behavior. The first row prints (without an index number) and the row count is 1 less than the true row count.

Any suggestions greatly appreciated!


Solution

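  • The row printed without an index number is the column-header row. header=18 tells pandas to use line 19 (zero-based row 18) as the column labels, and with no header argument it uses the first line of the file, so in both cases one row is consumed as labels and the row count comes out one short. One approach is to skip the 18 header lines and define the column names yourself; passing names= makes pandas read the first remaining line as data: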

    import pandas as pd
    file_path = 'my_log'
    column_names = ['Column1', 'Column2', 'Column3']  # placeholder names, one per field
    df = pd.read_csv(file_path, skiprows=18, names=column_names)
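
To see the difference between the two calls without the original log, here is a minimal sketch. It assumes a made-up in-memory file (18 single-field preamble lines followed by three comma-separated rows) as a stand-in for my_log; the row values and column names are placeholders, not the asker's data.

```python
import io
import pandas as pd

# Hypothetical stand-in for 'my_log': 18 preamble lines, then 3-field rows.
preamble = "".join(f"header line {i}\n" for i in range(1, 19))
rows = "a,1,x\nb,2,y\nc,3,z\n"
text = preamble + rows

# header=18 uses the 19th line as column labels, so it is not a data row.
df_header = pd.read_csv(io.StringIO(text), header=18)

# skiprows=18 plus names= keeps the 19th line as the first data row (index 0).
column_names = ["Column1", "Column2", "Column3"]  # placeholder names
df_names = pd.read_csv(io.StringIO(text), skiprows=18, names=column_names)

print(list(df_header.columns), len(df_header))  # ['a', '1', 'x'] 2
print(list(df_names.columns), len(df_names))    # ['Column1', 'Column2', 'Column3'] 3
```

In the first call the values from line 19 end up as the column labels, which matches the "row with no index number" described in the question. In the second call the same line appears as row 0.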