python pandas dataframe merge concatenation

Merge two dataframes groupby the column values of a dataframe

I have the following dataframe:

df1 = pd.DataFrame({'Parent': ['Stay home', "Stay home","Stay home", 'Go outside', "Go Outside","Go outside"],
                    'Child' : ['Severe weather', "Severe weather", "Severe weather", 'Sunny', "Sunny", "sunny"]})

    Parent      Child
0   Stay home   Severe weather
1   Stay home   Severe weather
2   Stay home   Severe weather
3   Go outside  Sunny
4   Go Outside  Sunny
5   Go outside  sunny

And a second one:

df2 = pd.DataFrame({'Similarity_Score': ['SimilarityScore:0.43693185876069784', 'SimilarityScore:0.299807821163373']})

             Similarity_Score
0   SimilarityScore:0.43693185876069784
1   SimilarityScore:0.299807821163373

I want to merge the two dataframes based on the values of Child values of df1.

Expected outcome:

     Parent     Child           Similarity_Score
0   Stay home   Severe weather  0.43693185876069784
1   Stay home   Severe weather  0.43693185876069784
2   Stay home   Severe weather  0.43693185876069784
3   Go outside  Sunny           0.299807821163373
4   Go Outside  Sunny           0.299807821163373
5   Go outside  sunny           0.299807821163373

I tried the usual merge and concat methods but could not find the solution. Any ideas?

Solution

If you want to assign the scores based on the Child's value, you can do it like that:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Parent': ['Stay home', "Stay home", "Stay home", 'Go outside', "Go Outside", "Go outside"],
                    'Child': ['Severe weather', "Severe weather", "Severe weather", 'Sunny', "Sunny", "Sunny"]})
df2 = pd.DataFrame({'Similarity_Score': ['SimilarityScore:0.43693185876069784', 'SimilarityScore:0.299807821163373']})

# Split the string at : and convert to float
df2['Similarity_Score'] = df2['Similarity_Score'].str.split(':').str[1].astype(float)

# calculate auxiliary column position to base the matching on
df1['position'] = df1['Child'].apply(lambda row: np.where(df1['Child'].unique() == row)[0][0])

# merge both dataframes and drop auxiliary column position
df = df1.merge(df2, left_on='position', right_index=True).drop(columns=["position"])