I have a data frame with 10 columns. You can use this code to generate an example frame called df.
import numpy as np
import pandas as pd

cols = []
for i in range(1, 11):
    cols.append(f'x{i}')
df = pd.DataFrame(np.random.randint(10, 99, size=(10, 10)), columns=cols)
The data frame will look something like this. It is randomly generated, so your figures will be different.
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
0 91 30 82 10 92 62 43 66 96 88
1 61 95 77 16 19 67 88 44 72 52
2 44 21 68 93 29 40 25 78 96 94
3 80 11 50 55 14 56 21 78 36 41
4 84 52 97 29 92 44 89 78 27 62
5 11 82 83 84 34 90 56 74 68 76
6 31 92 13 89 95 80 75 59 81 74
7 14 25 47 98 67 18 78 10 64 40
8 52 75 60 44 36 18 33 79 65 18
9 19 69 12 61 60 92 61 21 43 72
I want to apply a function which returns a tuple. I want to use the tuples to create 2 columns in my data frame.
def some_func(i1, i2):
    o1 = i2 / i1 * 0.5
    o2 = i2 * o1 * 6
    return o1, o2
When I do this,
df['c1'], df['c2'] = df.apply(lambda row: some_func(row['x9'],row['x10']), axis=1)
I get this error,
ValueError: too many values to unpack (expected 2)
The output should look like this,
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 c1 c2
0 91 30 82 10 92 62 43 66 96 88 0.458333 242.000000
1 61 95 77 16 19 67 88 44 72 52 0.361111 112.666667
2 44 21 68 93 29 40 25 78 96 94 0.489583 276.125000
3 80 11 50 55 14 56 21 78 36 41 0.569444 140.083333
4 84 52 97 29 92 44 89 78 27 62 1.148148 427.111111
5 11 82 83 84 34 90 56 74 68 76 0.558824 254.823529
6 31 92 13 89 95 80 75 59 81 74 0.456790 202.814815
7 14 25 47 98 67 18 78 10 64 40 0.312500 75.000000
8 52 75 60 44 36 18 33 79 65 18 0.138462 14.953846
9 19 69 12 61 60 92 61 21 43 72 0.837209 361.674419
If I only return 1 output, and create 1 column it works fine. How do I output 2 items (tuple or list of 2 items) and create 2 new columns using this?
Since you need to iterate over several columns row by row, a more efficient approach than apply is to use zip with a for loop (here a list comprehension) to build a list of tuples, which you can assign directly to a list of new columns in the original data frame:
df[['c1', 'c2']] = [some_func(x, y) for x, y in zip(df.x9, df.x10)]
df
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 c1 c2
0 20 67 76 95 28 60 82 81 90 93 0.516667 288.300000
1 94 30 97 82 51 10 54 43 36 41 0.569444 140.083333
2 50 57 85 48 67 65 41 91 48 46 0.479167 132.250000
3 61 36 44 59 18 71 42 18 56 77 0.687500 317.625000
4 11 85 34 66 45 55 21 42 77 27 0.175325 28.402597
5 20 19 86 46 97 21 84 12 86 98 0.569767 335.023256
6 24 87 65 62 22 43 26 80 15 64 2.133333 819.200000
7 38 15 23 22 89 89 19 32 21 33 0.785714 155.571429
8 82 88 64 89 92 88 15 30 85 83 0.488235 243.141176
9 96 24 91 70 96 54 57 81 59 32 0.271186 52.067797
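As for why the original attempt fails: with axis=1, apply returns a single Series holding one tuple per row (10 of them here), so unpacking it into two names raises the ValueError. If you would rather keep apply, the sketch below shows two alternatives. It assumes a reasonably recent pandas where apply accepts result_type; the fixed seed and the names rng, expanded, v1 and v2 are only for illustration and are not part of the question.

```python
import numpy as np
import pandas as pd


def some_func(i1, i2):
    o1 = i2 / i1 * 0.5
    o2 = i2 * o1 * 6
    return o1, o2


# Fixed seed (an assumption, only so the example is reproducible).
rng = np.random.default_rng(0)
cols = [f"x{i}" for i in range(1, 11)]
df = pd.DataFrame(rng.integers(10, 99, size=(10, 10)), columns=cols)

# Option 1: result_type="expand" turns each returned tuple into its own
# columns, so apply gives back a two-column DataFrame instead of a
# Series of tuples.
expanded = df.apply(
    lambda row: some_func(row["x9"], row["x10"]),
    axis=1,
    result_type="expand",
)
expanded.columns = ["c1", "c2"]
df = df.join(expanded)

# Option 2: some_func only uses arithmetic, so it also works on whole
# columns at once and returns a tuple of two Series.
v1, v2 = some_func(df["x9"], df["x10"])
```

Option 2 is only possible because some_func is plain arithmetic. For a function with branching or other per-row logic, the zip approach above or result_type="expand" is the more general choice.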