python, pandas, csv, dataframe

How to get rid of "Unnamed: 0" column in a pandas DataFrame read in from CSV file?


Sometimes when I read a CSV file into a DataFrame, I get an unwanted index-like column named "Unnamed: 0".

file.csv

,A,B,C
0,1,2,3
1,4,5,6
2,7,8,9

The CSV is read with this:

pd.read_csv('file.csv')

   Unnamed: 0  A  B  C
0           0  1  2  3
1           1  4  5  6
2           2  7  8  9

This is very annoying! Does anyone have an idea on how to get rid of this?


Solution

  • It's the index column. Pass index=False to df.to_csv(...) so that the unnamed index column is not written out in the first place; see the to_csv() docs.

    Example:

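    In [37]:
    import io
    import numpy as np
    import pandas as pd

    # to_csv() writes the index by default, so it is read back as "Unnamed: 0"
    df = pd.DataFrame(np.random.randn(5,3), columns=list('abc'))
    pd.read_csv(io.StringIO(df.to_csv()))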
    
    Out[37]:
       Unnamed: 0         a         b         c
    0           0  0.109066 -1.112704 -0.545209
    1           1  0.447114  1.525341  0.317252
    2           2  0.507495  0.137863  0.886283
    3           3  1.452867  1.888363  1.168101
    4           4  0.901371 -0.704805  0.088335
    

    compare with:

    In [38]:
    pd.read_csv(io.StringIO(df.to_csv(index=False)))
    
    Out[38]:
              a         b         c
    0  0.109066 -1.112704 -0.545209
    1  0.447114  1.525341  0.317252
    2  0.507495  0.137863  0.886283
    3  1.452867  1.888363  1.168101
    4  0.901371 -0.704805  0.088335
    

    If the file has already been written with the index, you can instead tell read_csv that the first column is the index by passing index_col=0:

    In [40]:
    pd.read_csv(io.StringIO(df.to_csv()), index_col=0)
    
    Out[40]:
              a         b         c
    0  0.109066 -1.112704 -0.545209
    1  0.447114  1.525341  0.317252
    2  0.507495  0.137863  0.886283
    3  1.452867  1.888363  1.168101
    4  0.901371 -0.704805  0.088335
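
For reference, here is a minimal, self-contained sketch that applies both fixes to the exact CSV text from the question. The variable names (csv_text, raw, fixed, round_trip, dropped) are only illustrative, and the last step, dropping the column after loading, is not part of the answer above; it is an assumed fallback for when you cannot change how the file is written or read.

```python
import io

import pandas as pd

# The CSV text from the question: the header row starts with an empty field,
# which read_csv labels "Unnamed: 0".
csv_text = ",A,B,C\n0,1,2,3\n1,4,5,6\n2,7,8,9\n"

# Default read: the saved index comes back as an ordinary column.
raw = pd.read_csv(io.StringIO(csv_text))
print(list(raw.columns))

# Read-side fix: treat the first column as the index.
fixed = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(fixed.columns))

# Write-side fix: do not write the index, so nothing extra is read back.
round_trip = pd.read_csv(io.StringIO(fixed.to_csv(index=False)))
print(list(round_trip.columns))

# Assumed fallback (not from the answer above): drop any column whose
# name starts with "Unnamed" from a frame that is already loaded.
dropped = raw.loc[:, ~raw.columns.str.startswith("Unnamed")]
print(list(dropped.columns))
```

Which fix to use depends on whether the index carries information: index_col=0 keeps it as the index, while index=False discards it at write time.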