
How to rename the rows of a DataFrame read with pandas (Python)?


I want to rename the rows of a DataFrame in a Python program (Spyder 3, Python 3.6). So far I have this:

import pandas as pd
data = pd.read_csv(filepath, delim_whitespace = True, header = None)

Before that, I renamed my columns:

data.columns = ['A', 'B', 'C']

That gave me this:

    A   B   C  
0   1   n   1  
1   1   H   0  
2   2   He  1  
3   3   Be  2  

Now I want to rename the rows so that the result looks like this:

     A   B   C  
n    1   n   1  
H    1   H   0  
He   2   He  1  
Be   3   Be  2

How can I do it? The main idea is to replace every row label created by pd.read_csv with the value in column B. I tried something like this:

for rows in data:
    data.rename(index={0:'df.loc(index, 'B')', 1:'one'})

but it doesn't work.

Any ideas? Could I simply replace the DataFrame's row labels with column B? How?


Solution

  • I think you need set_index with rename_axis (df here is the DataFrame called data in the question):

    df1 = df.set_index('B', drop=False).rename_axis(None)
    

    A solution with rename and a dictionary:

    df1 = df.rename(dict(zip(df.index, df['B'])))
    
    print (dict(zip(df.index, df['B'])))
    {0: 'n', 1: 'H', 2: 'He', 3: 'Be'}
    

    If the DataFrame has the default RangeIndex, the solution can be:

    df1 = df.rename(dict(enumerate(df['B'])))
    
    print (dict(enumerate(df['B'])))
    {0: 'n', 1: 'H', 2: 'He', 3: 'Be'}
    

    Output:

    print (df1)
        A   B  C
    n   1   n  1
    H   1   H  0
    He  2  He  1
    Be  3  Be  2
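
As a minimal, self-contained sketch of the first two approaches: the frame below is built by hand as a stand-in for the one read from the file (the values are copied from the question, and the names by_set_index and by_rename are only illustrative). Both calls should produce the same row labels.

```python
import pandas as pd

# Hand-built stand-in for the frame read from the file in the question.
data = pd.DataFrame({"A": [1, 1, 2, 3],
                     "B": ["n", "H", "He", "Be"],
                     "C": [1, 0, 1, 2]})

# Approach 1: use column B as the index, keep B as a column (drop=False),
# and clear the index name so no extra header row is printed.
by_set_index = data.set_index("B", drop=False).rename_axis(None)

# Approach 2: map each old row label to the matching value in column B.
by_rename = data.rename(index=dict(zip(data.index, data["B"])))

print(by_set_index)
print(list(by_rename.index))
```

Neither call modifies data in place; each returns a new DataFrame, so the result has to be assigned to a name (or back to data) to be kept.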
    

    EDIT:

    If you don't want column B, the solution is read_csv with the index_col parameter:

    import pandas as pd
    from io import StringIO
    
    temp=u"""1 n 1
    1 H 0
    2 He 1
    3 Be 2"""
    #pd.compat.StringIO was removed from newer pandas, so io.StringIO is used
    #after testing replace 'StringIO(temp)' with 'filename.csv'
    df = pd.read_csv(StringIO(temp), sep=r'\s+', header=None, index_col=[1])
    print (df)
        0  2
    1       
    n   1  1
    H   1  0
    He  2  1
    Be  3  2
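
If the column names from the question are wanted as well, one possible variant is to pass them through the names parameter and refer to the index column by name. This is only a sketch under that assumption: the sample text stands in for the real file, and raw is an illustrative variable name.

```python
from io import StringIO

import pandas as pd

# Sample rows from the question; replace StringIO(raw) with the real file path.
raw = "1 n 1\n1 H 0\n2 He 1\n3 Be 2"

# names= labels the columns, index_col="B" turns column B into the row labels.
df = pd.read_csv(StringIO(raw), sep=r"\s+", header=None,
                 names=["A", "B", "C"], index_col="B")

# Clear the index name so the printed frame has no extra "B" header row.
df = df.rename_axis(None)
print(df)
```

With this variant column B is no longer a regular column, so the frame keeps only A and C alongside the new row labels.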