Search code examples
pythonpandasdataframeconcatenationmirroring

Concatenate by mirroring same valuecolumns of different dataframes


I just want to make one dataframe (df3) with concatenate and by mirroring same columns values of two different dataframes (df1 and df2). Its just an example my datasets are way bigger.

> df1
  id_synthese1 format_1 ville
0             .dat       tours                      
1             .map       bordeaux
2             .sig       marseille                       

> df2
  id_synthese2 format_2  etude
0             .map       environnement                      
1             .sig       geotec
2             .dat       sismique

I would like an output like

> df3
  id_synthese1 format_1 ville       id_synthese2      format_2     etude    
0             .dat       tours                  2     .dat         sismique
1             .map       bordeaux               0     .map         environnement
2             .sig       marseille              1     .sig         geotec

The "mirroring" is between the columns: format_1 and format_2

I have tried

df3= pd.merge(df1, df2, on=['format_1','format_2'], how='outer')

and

df3= df1.join(
    df2.set_index(['format_1','format_2']),
    lsuffix="_x",
    rsuffix="_y",
    on=['format_1','format_2'])

and

df3= pd.concat([df1, df2], axis=1)

Thank you (i know that's an easy one... I am a beginner)


Solution

  • pass left_on = "format_1" and right_on = "format_2" instead of on parameter in your pd.merge call

    https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html