My question is about pandas in Python. I have two Series, each containing strings. To simplify, I've combined the two Series into a DataFrame as follows:
import pandas as pd
import numpy as np
my_df = pd.DataFrame([['ab', 'bz', 'b'], ['cd', 'ct', 'c'], ['ef', 'ka', np.nan]], columns=['sr_1', 'sr_2', 'intersection'])
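How can I compute the intersection column, i.e. the characters that sr_1 and sr_2 have in common in each row? Any ideas for this?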
This is what you can do:
import pandas as pd
import numpy as np
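df1 = pd.DataFrame({'sr1': ['ab', 'cd', 'ef'],
                    'sr2': ['bz', 'ct', 'ka']})
# Row by row, collect the characters that the two strings share
df1['intersection'] = df1.apply(lambda x: set(x['sr1']) & set(x['sr2']), axis=1)
# Keep one shared character per row (the first in sorted order), or NaN if there is none
df1['intersection'] = df1['intersection'].apply(lambda x: sorted(x)[0] if x else np.nan)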
The output: the intersection column contains b, c and NaN, matching the expected values in the question.
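The approach above keeps a single shared character per row. If rows might share several characters, or contain missing values, a variant along these lines may be worth trying. It is only a sketch: the helper name `common_chars` is hypothetical (not part of pandas), and joining the shared characters in sorted order is an assumption about what result is wanted.

```python
import numpy as np
import pandas as pd


def common_chars(a, b):
    """Return the characters shared by a and b, sorted and joined; NaN if none.

    Hypothetical helper for illustration; non-string inputs (e.g. NaN) give NaN.
    """
    if not isinstance(a, str) or not isinstance(b, str):
        return np.nan
    shared = sorted(set(a) & set(b))
    return "".join(shared) if shared else np.nan


df1 = pd.DataFrame({"sr1": ["ab", "cd", "ef"],
                    "sr2": ["bz", "ct", "ka"]})

# Pair up the two columns element by element instead of using apply(axis=1)
df1["intersection"] = [common_chars(a, b) for a, b in zip(df1["sr1"], df1["sr2"])]
print(df1)
```

Iterating over `zip` of the two columns avoids building a row object for every record, which is usually lighter than `apply(axis=1)` on larger frames.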