
Avoiding a MultiIndex when reading with read_csv


I'm trying to read a CSV file that appears to have a problem on a specific line.

I'm trying to investigate the problem, since I got this error:

Error tokenizing data. C error: Expected 23 fields in line 27, saw 37

Here's what I discovered:

The first 26 lines are read OK:

zero=pd.read_csv(basepath/nome, low_memory=False, dtype=str, delimiter=";", nrows=25)

But when I start reading at line 26, the function returns a DataFrame with a MultiIndex, even though the data has no MultiIndex:

zero=pd.read_csv(basepath/nome, low_memory=False, dtype=str, delimiter=";", skiprows=25)

Even when I force the index to None (index_col=None), the result is a MultiIndex table (it is displayed with the first 9 columns as index levels).

How can I avoid this and read the CSV properly?


Solution

  • Error tokenizing data. C error: Expected 23 fields in line 27, saw 37
    

    Most likely there are separators inside some fields in that row.
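
One way to check this is to count the fields on each line before handing the file to pandas. The following is only a sketch using the standard csv module; the sample text, the column names, and the helper name field_counts are made up for illustration, and on the real file you would read from basepath/nome instead of a string.

```python
import csv
import io

# Hypothetical sample: the header has 3 fields, but the last row contains
# an unquoted ';' inside the company name, so it splits into 4 fields.
sample = "name;code;city\nACME;001;Torino\nFOO; BAR S.A.S.;002;Napoli\n"


def field_counts(text, delimiter=";"):
    """Return (row_number, field_count) for every row in the text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [(i, len(row)) for i, row in enumerate(reader, start=1)]


counts = field_counts(sample)
expected = counts[0][1]  # number of fields in the header row
bad_lines = [(n, c) for n, c in counts if c != expected]
print(bad_lines)
```

Rows whose count differs from the header's are the ones worth opening in a text editor.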

    Make sure those fields are quoted in the file and that read_csv is called with quotechar='"' (which is also the default).
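
As a small illustration, a quoted field that contains the delimiter is read as a single value. The data and column names below are invented for the example, not taken from the original file.

```python
import io

import pandas as pd

# Hypothetical data: the first row's name contains ';', so it is quoted.
data = 'name;code\n"ROSSI; BIANCHI S.N.C.";001\nVERDI S.R.L.;002\n'

df = pd.read_csv(io.StringIO(data), delimiter=";", quotechar='"', dtype=str)

# The quoted ';' stays inside the field instead of creating an extra column.
print(df)
```

With dtype=str, as in the question, the codes keep their leading zeros.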


    To handle quotes contained in a field like:

    "L.E.P. DI PIROZZI CARMINE S.A.S.\"";;;;;;;; "08020650019";
    

    escapechar='\\' can be used, so that the backslash-escaped quote is read as a literal character.
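
A minimal sketch of the escapechar option, using a shortened version of the row above with only two columns. The column names and the two-column layout are assumptions for the example.

```python
import io

import pandas as pd

# Hypothetical two-column version of the problematic row: the name ends with
# a backslash-escaped quote (\") followed by the closing quote of the field.
data = 'name;code\n"PIROZZI S.A.S.\\"";"08020650019"\n'

df = pd.read_csv(
    io.StringIO(data),
    delimiter=";",
    quotechar='"',
    escapechar="\\",  # treat \" as a literal quote inside the field
    dtype=str,
)

print(df.loc[0, "name"])
print(df.loc[0, "code"])
```

The escaped quote ends up as part of the name, and the row still splits into the expected two fields.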