Search code examples
pythonpandasdataframesubtractionlookup-tables

Pandas subtract DataFrame column from lookup DataFrame column


Given a dataframe test and another dataframe lookup

test = pd.DataFrame( [['a',1],
                      ['a',2], 
                      ['b',9]], columns = ['id', 'n'])

lookup_mins = pd.DataFrame( [['a',1], 
                             ['b',9], 
                             ['c',7]] , columns = ['id', 'n'])

trying to subtract each value n in test by the value n with matching id in lookup_mins using

s = lookup_mins.groupby(['id'])['n'].transform('min')
test['n2'] = test['n'] - s

Expected result,

id n n2
a  1 0
a  2 1
b  9 0

but getting instead

id n n2
a  1 0
a  2 -7
b  9  9

How to subtract test and lookup_mins to get the expected result as stated above?


Solution

  • Use Series.map with aggregate min:

    s = lookup_mins.groupby(['id'])['n'].min()
    test['n2'] = test['n'] - test['id'].map(s)
    print (test)
      id  n  n2
    0  a  1   0
    1  a  2   1
    2  b  9   0