
How to represent data in pandas dataframe in a particular form by using concatenate and pivot table


I have two pandas DataFrames as output:

   Modeling Methods(Overall Themes & FR)  RMSE (CV=10)
0                                 Lasso     -0.559883
1                                   SVR     -0.642521
2                                 NuSVR     -0.602523
3             GradientBoostingRegressor     -0.773394
4                 RandomForestRegressor     -0.866475

and

  Modeling Methods(4 Themes & FR)  RMSE (CV=10)
0                           Lasso     -0.559883
1                             SVR     -0.655144
2                           NuSVR     -0.639760
3       GradientBoostingRegressor     -0.860851
4           RandomForestRegressor     -0.818647

I want to join these two data frames in the following form:

                                            Lasso   SVR   NuSVR      GradientBoostingRegressor   RandomForestRegressor
0   Modeling Methods(4 Themes & FR)        -0.55   -0.65  -0.63          -0.86                     -0.81
1   Modeling Methods(Overall Themes & FR)  -0.55   -0.64  -0.60          -0.77                     -0.86

I have used the following code, but the result is not what I expected:

frames = [factor_flood_response, only_flood_response, Theme4_flood_response,Overall_Theme_flood_response]
result = pd.concat(frames, axis=0, join='outer')
print(result)

Solution

  • Let's try this:

    pd.concat([i.set_index(i.columns[0]).rename(columns={'RMSE (CV=10)':i.columns[0]}).T for i in [df1,df2]])
    

    Output:

                                              Lasso       SVR     NuSVR  \
    Modeling Methods(Overall Themes & FR) -0.559883 -0.642521 -0.602523
    Modeling Methods(4 Themes & FR)       -0.559883 -0.655144 -0.639760

                                           GradientBoostingRegressor  \
    Modeling Methods(Overall Themes & FR)                  -0.773394
    Modeling Methods(4 Themes & FR)                        -0.860851

                                           RandomForestRegressor
    Modeling Methods(Overall Themes & FR)              -0.866475
    Modeling Methods(4 Themes & FR)                    -0.818647
    

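    The list comprehension does three things to each frame: set_index turns the method names into the index, rename gives the RMSE column the frame's own first column name, and .T transposes the frame so that it becomes a single row. pd.concat then stacks those rows to give the result above, where: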

    print(df1)
    
      Modeling Methods(Overall Themes & FR)  RMSE (CV=10)
    0                                 Lasso     -0.559883
    1                                   SVR     -0.642521
    2                                 NuSVR     -0.602523
    3             GradientBoostingRegressor     -0.773394
    4                 RandomForestRegressor     -0.866475
    
    print(df2)
    
      Modeling Methods(4 Themes & FR)  RMSE (CV=10)
    0                           Lasso     -0.559883
    1                             SVR     -0.655144
    2                           NuSVR     -0.639760
    3       GradientBoostingRegressor     -0.860851
    4           RandomForestRegressor     -0.818647
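
For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the same reshape with the one-liner split into steps. The frames are rebuilt from the values in the question. The helper name `to_row` and the list `methods` are made up for this illustration and are not part of pandas or of the original code. The frames are concatenated as `df2` then `df1` only so that the row order matches the layout asked for in the question.

```python
import pandas as pd

# Method names shared by both frames (hypothetical helper list for this sketch).
methods = ["Lasso", "SVR", "NuSVR",
           "GradientBoostingRegressor", "RandomForestRegressor"]

# Rebuild the two frames using the values shown in the question.
df1 = pd.DataFrame({
    "Modeling Methods(Overall Themes & FR)": methods,
    "RMSE (CV=10)": [-0.559883, -0.642521, -0.602523, -0.773394, -0.866475],
})
df2 = pd.DataFrame({
    "Modeling Methods(4 Themes & FR)": methods,
    "RMSE (CV=10)": [-0.559883, -0.655144, -0.639760, -0.860851, -0.818647],
})


def to_row(df, value_col="RMSE (CV=10)"):
    """Turn a two-column (method, score) frame into one wide row."""
    label = df.columns[0]                      # the frame's first column name
    wide = (
        df.set_index(label)                    # method names become the index
          .rename(columns={value_col: label})  # score column takes the frame's label
          .T                                   # methods become columns, label becomes the row
    )
    wide.columns.name = None                   # drop the leftover axis name
    return wide


# Stack the single-row frames; the columns line up because the method names match.
result = pd.concat([to_row(df2), to_row(df1)])

# Rounding is optional and only affects how the values are displayed here.
print(result.round(2))
```

Note that this relies on both frames listing the same method names. If one frame had a method the other lacks, `pd.concat` would still work but would fill the missing cell with NaN.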