
pandas failing with variable columns


My file looks like this:

    4 7 a a
    s g 6 8 0 d
    g 6 2 1 f 7 9 
    f g 3 
    1 2 4 6 8 9 0

I was using pandas to load it into a pandas object, but I am getting the following error:

    pandas.parser.CParserError: Error tokenizing data. C error: Expected 6 fields in line 3, saw 8

The code I used was:

    file = pd.read_csv("a.txt", dtype=None, delimiter=" ")

Can anyone suggest a way to load the file as it is?


Solution

  • Here's one way.

    In [50]: !type temp.csv
    4,7,a,a
    s,g,6,8,0,d
    g,6,2,1,f,7,9
    f,g,3
    1,2,4,6,8,9,0
    

    Read the CSV into a list of lists, then convert it to a DataFrame.

    In [51]: pd.DataFrame([line.strip().split(',') for line in open('temp.csv', 'r')])
    Out[51]:
       0  1  2     3     4     5     6
    0  4  7  a     a  None  None  None
    1  s  g  6     8     0     d  None
    2  g  6  2     1     f     7     9
    3  f  g  3  None  None  None  None
    4  1  2  4     6     8     9     0
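
The same idea can be applied to the space-separated file from the question. The following is a minimal sketch, not a definitive implementation: the helper name `read_ragged` and the inline `SAMPLE` string are made up for illustration, and the sample simply copies the question's data, including the trailing space on the third line. It relies on `str.split()` with no argument, which splits on runs of whitespace and drops leading and trailing whitespace, so the trailing space does not create an extra empty field.

```python
import io

import pandas as pd

# Sample data copied from the question; the third line ends with a space.
SAMPLE = "4 7 a a\ns g 6 8 0 d\ng 6 2 1 f 7 9 \nf g 3\n1 2 4 6 8 9 0\n"


def read_ragged(handle):
    """Build a DataFrame from whitespace-separated lines of unequal length.

    `handle` is any iterable of text lines, such as an open file.
    """
    # Skip blank lines; split each remaining line on runs of whitespace.
    rows = [line.split() for line in handle if line.strip()]
    # pandas pads the shorter rows with missing values.
    return pd.DataFrame(rows)


df = read_ragged(io.StringIO(SAMPLE))
print(df.shape)  # (5, 7)
```

For a real file, something like `with open("a.txt") as fh: df = read_ragged(fh)` should work the same way. Every value comes back as a string, so numeric columns would still need converting afterwards.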