I have this dataframe:
import pandas as pd

df = pd.DataFrame({
    'c1': ['a', 'f,g,e', 'a,f,e,h', 'g,h', 'b,c,g,h'],
    'c2': ['1', '1,1,0.5', '1,2,2.5,1', '3,1', '2,-1,0.5,-1'],
    'c3': ['0.05', '0.01,0.001,>0.5', '>0.9,>0.9,0.01,0.002', '>0.9,>0.9', '0.05,0.1,<0.01,0.1'],
})
yielding
c1       c2           c3
a        1            0.05
f,g,e    1,1,0.5      0.01,0.001,>0.5
a,f,e,h  1,2,2.5,1    >0.9,>0.9,0.01,0.002
g,h      3,1          >0.9,>0.9
b,c,g,h  2,-1,0.5,-1  0.05,0.1,<0.01,0.1
I would like to combine c1, c2 and c3 element-wise to create a new column c4 (see the desired result below):
c1       c2           c3                    c4
a        1            0.05                  a(1|0.05)
f,g,e    1,1,0.5      0.01,0.001,>0.5       f(1|0.01),g(1|0.001),e(0.5|>0.5)
a,f,e,h  1,2,2.5,1    >0.9,>0.9,0.01,0.002  a(1|>0.9),f(2|>0.9),e(2.5|0.01),h(1|0.002)
g,h      3,1          >0.9,>0.9             g(3|>0.9),h(1|>0.9)
b,c,g,h  2,-1,0.5,-1  0.05,0.1,<0.01,0.1    b(2|0.05),c(-1|0.1),g(0.5|<0.01),h(-1|0.1)
I tried adapting the answers to this question and this question, but they did not work.
You can use a list comprehension with zip, str.split and str.join:
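# outer zip: walk the three columns row by row
# inner zip(*...): split each cell on ',' and pair up the pieces by position
df['c4'] = [','.join([f'{a}({b}|{c})' for a, b, c in
                      zip(*(y.split(',') for y in x))])
            for x in zip(df['c1'], df['c2'], df['c3'])]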
NB. the same can be done with apply, but a list comprehension is generally more efficient.
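For comparison, here is a minimal sketch of what the apply variant could look like. The helper name combine is hypothetical (it is not part of pandas), and the frame is cut down to the first two rows of the example to keep it short.

```python
import pandas as pd

# First two rows of the example frame, enough to show the idea.
df = pd.DataFrame({
    'c1': ['a', 'f,g,e'],
    'c2': ['1', '1,1,0.5'],
    'c3': ['0.05', '0.01,0.001,>0.5'],
})

def combine(row):
    # Hypothetical helper: split each cell on ',' and pair the pieces by position.
    parts = zip(row['c1'].split(','), row['c2'].split(','), row['c3'].split(','))
    return ','.join(f'{a}({b}|{c})' for a, b, c in parts)

# axis=1 passes each row to combine as a Series.
df['c4'] = df.apply(combine, axis=1)
```

This builds one Series per row, which is the main reason it tends to be slower than the list comprehension on large frames.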
Output:
   c1       c2           c3                    c4
0  a        1            0.05                  a(1|0.05)
1  f,g,e    1,1,0.5      0.01,0.001,>0.5       f(1|0.01),g(1|0.001),e(0.5|>0.5)
2  a,f,e,h  1,2,2.5,1    >0.9,>0.9,0.01,0.002  a(1|>0.9),f(2|>0.9),e(2.5|0.01),h(1|0.002)
3  g,h      3,1          >0.9,>0.9             g(3|>0.9),h(1|>0.9)
4  b,c,g,h  2,-1,0.5,-1  0.05,0.1,<0.01,0.1    b(2|0.05),c(-1|0.1),g(0.5|<0.01),h(-1|0.1)
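To see what the inner zip(*...) is doing, it may help to unpack a single row by hand. The variable names below (row, split_cells, triples) are only for illustration.

```python
# One row of the frame, as the outer zip would yield it.
row = ('f,g,e', '1,1,0.5', '0.01,0.001,>0.5')

# Step 1: split every cell on ','.
split_cells = [cell.split(',') for cell in row]
# [['f', 'g', 'e'], ['1', '1', '0.5'], ['0.01', '0.001', '>0.5']]

# Step 2: zip(*...) transposes the three lists into per-item triples.
triples = list(zip(*split_cells))
# [('f', '1', '0.01'), ('g', '1', '0.001'), ('e', '0.5', '>0.5')]

# Step 3: format each triple and join the results back with ','.
c4 = ','.join(f'{a}({b}|{c})' for a, b, c in triples)
# 'f(1|0.01),g(1|0.001),e(0.5|>0.5)'
```

Note that zip stops at the shortest input, so if a row has a different number of comma-separated items in c1, c2 and c3, the extra items are dropped silently rather than raising an error.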