For my use case, the original frame looks like this:
index | col1 | col2 | col3 |
---|---|---|---|
0 | 0 | zeroth eg | reject |
1 | 1 | first eg | accept |
2 | 2 | second eg | accept |
3 | 3 | third eg | reject |
I have a function defined as:
def foo(row):
    if row['col1']==0:
        answers = ['zero']
    elif row['col1']==1:
        answers = ['one', 'i']
    elif row['col1']==2:
        answers = ['two', 'ii']
    else:
        answers = ['three', 'iii']
Based on this function, I want to add a new column called col4 to my dataframe. Each original row should be repeated once per value in its answers list, with col4 taking the successive values of that list while all other columns stay the same.
So I want the resulting frame to look like this:
index | col1 | col2 | col3 | col4 |
---|---|---|---|---|
0 | 0 | zeroth eg | reject | zero |
1 | 1 | first eg | accept | one |
2 | 1 | first eg | accept | i |
3 | 2 | second eg | accept | two |
4 | 2 | second eg | accept | ii |
5 | 3 | third eg | reject | three |
6 | 3 | third eg | reject | iii |
I cannot understand how to use apply to return rows, let alone multiple rows per input row. The code below just adds a new column col4 containing lists to my original frame (if I return answers in foo):
input_df['col4'] = input_df.apply(foo, axis=1)
How can I modify foo to return multiple rows?
Any help appreciated.
You can return the list from foo and then explode the new column, which turns each list element into its own row:
def foo(row):
    if row['col1']==0:
        answers = ['zero']
    elif row['col1']==1:
        answers = ['one', 'i']
    elif row['col1']==2:
        answers = ['two', 'ii']
    else:
        answers = ['three', 'iii']
    return answers

# col4 first holds one list per row
input_df['col4'] = input_df.apply(foo, axis=1)
# explode gives each list element its own row; ignore_index=True renumbers the rows 0..n-1
input_df = input_df.explode('col4', ignore_index=True)
print(input_df)
   index  col1       col2    col3   col4
0      0     0  zeroth eg  reject   zero
1      1     1   first eg  accept    one
2      1     1   first eg  accept      i
3      2     2  second eg  accept    two
4      2     2  second eg  accept     ii
5      3     3   third eg  reject  three
6      3     3   third eg  reject    iii
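For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. It rebuilds the question's frame by hand and, as an assumption, leaves out the separate index column shown in the printed output. The name result is only illustrative, and ignore_index needs pandas 1.1 or newer.

```python
import pandas as pd

# Rebuild the example frame from the question (assumed layout:
# only col1..col3, with the default RangeIndex as the row labels).
input_df = pd.DataFrame({
    "col1": [0, 1, 2, 3],
    "col2": ["zeroth eg", "first eg", "second eg", "third eg"],
    "col3": ["reject", "accept", "accept", "reject"],
})


def foo(row):
    # Return a list with one entry per output row wanted for this input row.
    if row["col1"] == 0:
        answers = ["zero"]
    elif row["col1"] == 1:
        answers = ["one", "i"]
    elif row["col1"] == 2:
        answers = ["two", "ii"]
    else:
        answers = ["three", "iii"]
    return answers


# Step 1: col4 holds one list per row.
input_df["col4"] = input_df.apply(foo, axis=1)

# Step 2: explode repeats each row once per list element;
# ignore_index=True renumbers the resulting rows from 0.
result = input_df.explode("col4", ignore_index=True)  # "result" is a hypothetical name

print(result)
```

The point of the two steps is that apply only ever produces one value per input row, so the row multiplication has to happen afterwards, and explode does exactly that while copying the other columns unchanged.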