python, pandas, python-itertools

How to map the weight and calculate the product?


This is my pandas dataframe. I have to calculate the weight in a new column 'Value'.

For example, if the combination of columns (col1 col2 col3 col4) is 'Right_Wrong_Wrong_Right', then 'value' should equal the product of the corresponding weights: 5 x 100 x 100 x 100 = 5,000,000.

I can't think of a way to map the rules and calculate the product of the weights without hard-coding every combination.


Solution

  • Something like this should work, although I can't really test it since you posted your dataframe as an image rather than as text.

    The right half of the dataframe is preprocessed into a Python dict, weights, that maps each quadruple ('from', 'to', 'from_answer', 'to_answer') to the corresponding weight.

    This dict is used to write a function get_value that can be used to calculate the value of each row.

    You can apply this function to the dataframe using .apply(get_value, axis=1).

    from math import prod  # Python 3.8+
    from operator import itemgetter
    
    #df = pd.DataFrame({'from': ['u1', 'u1', 'u1', ...], 'to': ['u2', 'u2', 'u2', ...], 'from_answer': ['Right', 'Right', 'Wrong', ...], 'to_answer': ['Right', 'Wrong', 'Right', ...], 'weight': [30, 5, 1, ...], 'u1': ['Right']*3, 'u2': ['Right']*3, 'u3': ['Right', 'Right', 'Wrong', ...], 'u4': ['Right', 'Wrong', 'Right', ...]})
    
    # Map each (from, to, from_answer, to_answer) quadruple to its weight.
    weights = {itemgetter('from', 'to', 'from_answer', 'to_answer')(row): row['weight'] for _, row in df.iterrows()}
    # {('u1', 'u2', 'Right', 'Right'): 30, ('u1', 'u2', 'Right', 'Wrong'): 5, ('u1', 'u2', 'Wrong', 'Right'): 1, ...}
    
    def get_value(row):
        # The question asks for the product of the weights, so multiply rather than sum.
        return prod(weights[(u, v, row[u], row[v])] for u, v in (('u1', 'u2'), ('u2', 'u3'), ('u3', 'u4'), ('u1', 'u4')))
    
    # axis=1 passes each row (not each column) to get_value.
    df['value'] = df.apply(get_value, axis=1)
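
Since the dataframe was only posted as an image, the lookup-and-multiply step can be checked on its own with plain dicts before touching pandas. This is a minimal sketch, not the asker's real data: the two pairs in PAIRS, the numbers in ANSWER_WEIGHTS, and the sample rows are all made up for illustration, and the same weight table is assumed for every pair.

```python
from math import prod

# Assumed pairs and weights, for illustration only; the real ones come
# from the rules half of the dataframe.
PAIRS = [("u1", "u2"), ("u2", "u3")]
ANSWER_WEIGHTS = {
    ("Right", "Right"): 30,
    ("Right", "Wrong"): 5,
    ("Wrong", "Right"): 1,
    ("Wrong", "Wrong"): 100,
}

# Same shape as the weights dict: (from, to, from_answer, to_answer) -> weight.
weights = {
    (u, v, a, b): w
    for u, v in PAIRS
    for (a, b), w in ANSWER_WEIGHTS.items()
}


def get_value(row):
    """Multiply the weights of every (from, to) pair for one row of answers."""
    return prod(weights[(u, v, row[u], row[v])] for u, v in PAIRS)


# Hypothetical answer rows; each dict stands in for one dataframe row.
rows = [
    {"u1": "Right", "u2": "Right", "u3": "Wrong"},  # 30 * 5
    {"u1": "Right", "u2": "Wrong", "u3": "Wrong"},  # 5 * 100
    {"u1": "Wrong", "u2": "Wrong", "u3": "Right"},  # 100 * 1
]
values = [get_value(row) for row in rows]
print(values)  # [150, 500, 100]
```

Because get_value only indexes the row by column name, the same function should also work on the row objects that a row-wise .apply(get_value, axis=1) passes in.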