
Pandas: create 2 columns in a data frame using an apply function that returns a tuple / list of 2 items


I have a data frame which has 10 columns. You can use this code to generate an example frame called df.

import numpy as np
import pandas as pd

cols = [f'x{i}' for i in range(1, 11)]  # column names x1 .. x10

# 10 x 10 frame of random integers in [10, 99)
df = pd.DataFrame(np.random.randint(10, 99, size=(10, 10)), columns=cols)

The data frame will look something like this. It is randomly generated, so your figures will be different.

    x1  x2  x3  x4  x5  x6  x7  x8  x9  x10
0   91  30  82  10  92  62  43  66  96  88
1   61  95  77  16  19  67  88  44  72  52
2   44  21  68  93  29  40  25  78  96  94
3   80  11  50  55  14  56  21  78  36  41
4   84  52  97  29  92  44  89  78  27  62
5   11  82  83  84  34  90  56  74  68  76
6   31  92  13  89  95  80  75  59  81  74
7   14  25  47  98  67  18  78  10  64  40
8   52  75  60  44  36  18  33  79  65  18
9   19  69  12  61  60  92  61  21  43  72

I want to apply a function that returns a tuple, and use the two items of each tuple to create 2 new columns in my data frame.

def some_func(i1, i2):
    o1 = i2 / i1 * 0.5
    o2 = i2 * o1 * 6
    return o1, o2

When I run this,

df['c1'], df['c2'] = df.apply(lambda row: some_func(row['x9'],row['x10']), axis=1)

I get this error,

ValueError: too many values to unpack (expected 2)

The output should look like this,

    x1  x2  x3  x4  x5  x6  x7  x8  x9  x10 c1          c2
0   91  30  82  10  92  62  43  66  96  88  0.458333    242.000000
1   61  95  77  16  19  67  88  44  72  52  0.361111    112.666667
2   44  21  68  93  29  40  25  78  96  94  0.489583    276.125000
3   80  11  50  55  14  56  21  78  36  41  0.569444    140.083333
4   84  52  97  29  92  44  89  78  27  62  1.148148    427.111111
5   11  82  83  84  34  90  56  74  68  76  0.558824    254.823529
6   31  92  13  89  95  80  75  59  81  74  0.456790    202.814815
7   14  25  47  98  67  18  78  10  64  40  0.312500    75.000000
8   52  75  60  44  36  18  33  79  65  18  0.138462    14.953846
9   19  69  12  61  60  92  61  21  43  72  0.837209    361.674419

If I return only 1 output and create 1 column, it works fine. How do I return 2 items (a tuple or list of 2 items) and create 2 new columns from them?


Solution

  • Since the function needs values from several columns of each row, a simpler and usually faster approach than apply is to zip those columns and build a list of tuples with a list comprehension. You can then assign that list directly to a list of new column names on the original data frame:

    df[['c1', 'c2']] = [some_func(x, y) for x, y in zip(df.x9, df.x10)]
    
    df    
       x1  x2  x3  x4  x5  x6  x7  x8  x9  x10        c1          c2
    0  20  67  76  95  28  60  82  81  90   93  0.516667  288.300000
    1  94  30  97  82  51  10  54  43  36   41  0.569444  140.083333
    2  50  57  85  48  67  65  41  91  48   46  0.479167  132.250000
    3  61  36  44  59  18  71  42  18  56   77  0.687500  317.625000
    4  11  85  34  66  45  55  21  42  77   27  0.175325   28.402597
    5  20  19  86  46  97  21  84  12  86   98  0.569767  335.023256
    6  24  87  65  62  22  43  26  80  15   64  2.133333  819.200000
    7  38  15  23  22  89  89  19  32  21   33  0.785714  155.571429
    8  82  88  64  89  92  88  15  30  85   83  0.488235  243.141176
    9  96  24  91  70  96  54  57  81  59   32  0.271186   52.067797
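
For comparison, here is a minimal sketch of why the original unpacking fails and of a few alternatives, assuming a reasonably recent pandas. The seeded generator `rng` and the names `rows`, `a1`, `a2`, `v1`, `v2` are only illustrative and are not part of the question. `apply(..., axis=1)` returns one tuple per row (10 here), so unpacking it into two targets raises the `ValueError`. Passing `result_type='expand'` should split each tuple into columns instead. Because `some_func` only uses arithmetic, it can also be called on whole columns at once.

```python
import numpy as np
import pandas as pd


def some_func(i1, i2):
    o1 = i2 / i1 * 0.5
    o2 = i2 * o1 * 6
    return o1, o2


# Assumption: a seeded generator, used here only to make the sketch repeatable.
rng = np.random.default_rng(0)
cols = [f'x{i}' for i in range(1, 11)]
df = pd.DataFrame(rng.integers(10, 99, size=(10, 10)), columns=cols)

# apply with axis=1 gives a Series holding one (o1, o2) tuple per row.
# Unpacking that Series into two names fails because it has 10 elements.
rows = df.apply(lambda row: some_func(row['x9'], row['x10']), axis=1)

# Option 1: let apply expand each tuple into its own column.
df[['c1', 'c2']] = df.apply(
    lambda row: some_func(row['x9'], row['x10']),
    axis=1,
    result_type='expand',
)

# Option 2: transpose the Series of tuples with zip(*...).
a1, a2 = zip(*rows)

# Option 3: call the function on whole columns; this works here only
# because some_func uses plain arithmetic that pandas can vectorize.
v1, v2 = some_func(df['x9'], df['x10'])
```

All three options should produce the same values as the list comprehension above. The vectorized call avoids a Python-level loop entirely, but it only applies when the function body works on whole Series.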