Search code examples
pythonpython-3.xpandasmergetransform

Pandas use assign/transform in pipe after a merge


I have two dfs

solar_part = pd.DataFrame(
     {'pool': 1,
     'orig': 635.1}, index = [0]
     )

solar_aod = pd.DataFrame(
     {'pool': [1,1,1,1],
      'MoP': [1,2,3,4],
     'prin': [113.1, 115.3, 456.6, 234.1]}
     )

Which I then merge together via pipeline and try and transform/assign a new variable

solar_p = (
    solar_aod
    .merge(solar_part, on = ['pool'], how = 'left')
    .assign(remn = ['prin'] / ['orig'])
    )

The assign (have tried transform as well) gives the error of TypeError: unsupported operand type(s) for /: 'list' and 'list' which I'm guessing is caused by the bracketing. Trying only quotes gives the same same error but with str instead of list. Not including an assign/transform function and then doing it "long hand" via solar_p.prin / solar_p.orig * 100 also works but I have several more equations to include, so I'd like it as concise as possible. How to do this tranform/assign after a merge in a pipeline?


Solution

  • You can use eval to create an additional column as well as assign but the latter need to evaluate the new dataframe before use columns that why you have to use a lambda function here:

    solar_p = (
        solar_aod
        .merge(solar_part, on='pool', how='left')
        .eval('remn = prin / orig')
    )
    

    Output:

    >>> solar_p
       pool  MoP   prin   orig      remn
    0     1    1  113.1  635.1  0.178082
    1     1    2  115.3  635.1  0.181546
    2     1    3  456.6  635.1  0.718942
    3     1    4  234.1  635.1  0.368603