python pandas list dataframe rename

rename columns of second dataframe with column names of first dataframe based on a list


I want to rename the columns of df2 using the column names of df1, then print the new df2 dataframe. I also want to drop from the new df2 any columns that are not listed in "df1_cols_to_rename_df2".

import pandas as pd

data1 = {'first_column': ['1', '2', '2'],
         'second_column': ['1', '2', '2'],
         'third column': ['1', '2', '2'],
         'fourth_column': ['1', '2', '2'],
         'fifth_column': ['1', '2', '2'],
         }

data2 = {'1st_column': ['1', '2', '4'],
         'some_column': ['1', '2', '2'],
         '3rd_column': ['1', '2', '2'],
         '4th_column': ['7', '2', '2'],
         '5th_column': ['1', '4', '2'],
         '2nd_column': ['1', '5', '3'],
         }

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

df1_cols_to_rename_df2 = {'first_column': ['1st_column'], 'second_column': ['2nd_column'], 'third column': ['3rd_column'], 'fourth_column': ['4th_column']}

This would be the expected output:

[image: expected output]


Solution

  • NB: df1_cols_to_rename_df2 does NOT include fifth_column, yet that column appears in the expected output. The question does not explain why, so this is assumed to be a typo in the OP.

    # invert df1_cols_to_rename_df2 so each df2 column name maps to its df1 name
    d = {v[0]: k for k, v in df1_cols_to_rename_df2.items()}

    # keep only the df2 columns that appear in the mapping, then rename them
    df2 = df2.loc[:, df2.columns.isin(list(d))].rename(columns=d)
    print(df2)
    
      first_column third column fourth_column second_column
    0            1            1             7             1
    1            2            2             2             5
    2            4            2             2             3
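
The result above keeps df2's original column order, so second_column ends up last. If the columns should instead follow the order of df1_cols_to_rename_df2, one option is to select them with a list built from the dict. This is only a sketch: the helper name rename_from_mapping is made up for illustration and is not part of pandas, and it assumes every value in the dict is a one-element list, as in the question.

```python
import pandas as pd


def rename_from_mapping(df, mapping):
    """Keep only the mapped columns of df and rename them.

    mapping has the shape used in the question: {new_name: [old_name]}.
    Hypothetical helper for illustration, not a pandas function.
    """
    # Invert {new: [old]} into {old: new}, the shape DataFrame.rename expects.
    old_to_new = {old[0]: new for new, old in mapping.items()}
    # Ignore mapped names that df does not actually contain.
    present = [old for old in old_to_new if old in df.columns]
    # Selecting with a list follows the mapping's order, not df's own order.
    return df[present].rename(columns=old_to_new)


df2 = pd.DataFrame({
    '1st_column': ['1', '2', '4'],
    'some_column': ['1', '2', '2'],
    '3rd_column': ['1', '2', '2'],
    '4th_column': ['7', '2', '2'],
    '5th_column': ['1', '4', '2'],
    '2nd_column': ['1', '5', '3'],
})

df1_cols_to_rename_df2 = {
    'first_column': ['1st_column'],
    'second_column': ['2nd_column'],
    'third column': ['3rd_column'],
    'fourth_column': ['4th_column'],
}

new_df2 = rename_from_mapping(df2, df1_cols_to_rename_df2)
print(new_df2)
```

Because some_column and 5th_column are not in the dict, they are dropped. If fifth_column is wanted as in the expected output, adding 'fifth_column': ['5th_column'] to the dict should be enough.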