Search code examples
pythonpandasdataframeseries

Assigning a Series to a pandas Dataframe depending on multiple different column entries


I have both a series as well as a DataFrame:

Series:
1    128
2    112
3    110

Where the Series index indicates a Month

DataFrame

   month     c1       c2
0  1         a       0
1  2         a       0
2  3         a       0
3  1         b       0
4  2         b       0
5  5         b       0
6  1         c       0
7  2         c       0
8  1         d       0
9  2         d       0

I want to apply the values of the series, so that the months match, but only where c1 has a specific value.

I tried using

df = ...
series = ... 

df.loc[df['c1'] == 'a', 'c2'] = series

But this won't work, as it doesnt use month as index. How can I make it regard the month in DataFrame?

Expected Result (c1 = a):

   month     c1       c2
0  1         a       128
1  2         a       112
2  3         a       110
3  1         b       0
4  2         b       0
5  5         b       0
6  1         c       0
7  2         c       0
8  1         d       0
9  2         d       0

Additionally, if i use the same series to apply for c1==b, the expected result is this (as there is no month 3, but only 5)

Expected Result (c1 = b):

   month     c1       c2
0  1         a       0
1  2         a       0
2  3         a       0
3  1         b       128
4  2         b       112
5  5         b       0
6  1         c       0
7  2         c       0
8  1         d       0
9  2         d       0

Solution

  • For better performance filter in both sides with mapping by Series.map column month, if no match replace missing values by Series.fillna with downcast='int' parameter:

    m = df['c1'] == 'a'
    df.loc[m, 'c2'] = df.loc[m, 'month'].map(series).fillna(0, downcast='int')
    print (df)
       month c1   c2
    0      1  a  128
    1      2  a  112
    2      3  a  110
    3      1  b    0
    4      2  b    0
    5      5  b    0
    6      1  c    0
    7      2  c    0
    8      1  d    0
    9      2  d    0
    
    
    m = df['c1'] == 'b'
    df.loc[m, 'c2'] = df.loc[m, 'month'].map(series).fillna(0, downcast='int')
    print (df)
       month c1   c2
    0      1  a    0
    1      2  a    0
    2      3  a    0
    3      1  b  128
    4      2  b  112
    5      5  b    0
    6      1  c    0
    7      2  c    0
    8      1  d    0
    9      2  d    0