Search code examples
pythonpandasrollup

Rolling up each columnar value in a matrix


Here is the data.

 day_value = {
        'android':[1,0,0,0,0,0,1],
         'iphone':[0,1,0,1,0,0,0],
            'web':[0,1,1,0,1,0,0],
     }


device_rollup = {
    'overall':['iphone','android','web'],
     'mobile':['iphone','android'],
 }

rollup_l7 = {
    'overall': 6,
     'mobile': 4,
 }

If the values are same on each column we must rollup as 1 not add it up. The total for overall should be 6 ((1,0,0)+(0,1,1)+(0,0,1)+(0,1,0)+ (0,0,1)+ (1,0,0)-> 1+1+1+1+1+1 =6)

Now I can add up all the values like this.

overall_val = sum(sum(v) for k,v in day_value.items() if k in device_rollup['overall'] )
mobile_val = sum(sum(v) for k,v in day_value.items() if k in device_rollup['mobile'] )

rollup_l7= {'overall':overall_val, 'mobile':mobile_val}
print(rollups_l7)

But, I don't want to add up instead I would like to roll up each columnar value.

Do we need to convert the list to binary and then covert back to integer?

I am not quite sure how I can achieve in regular python without using numpy


Solution

  • Solution

    I don't know typically how big the dictionary day_value will be. But you could use pandas as an alternative nonetheless. The following code block will process all the data as you explained.

    # pip install pandas
    import pandas as pd
    
    ## define label categories for modularity
    categories = {
        'overall': ['android', 'iphone', 'web'], 
        'mobile': ['android', 'iphone'], 
        'rollup': ['overall', 'mobile']  
    }
    
    ## create a dataframe with input data
    ## and process for 'overall' and 'mobile'
    df = pd.DataFrame(day_value)
    df['overall'] = df[categories['overall']].sum(axis=1) > 0
    df['mobile'] = df[categories['mobile']].sum(axis=1) > 0
    
    ## evaluate totals of all columns in df
    total = df.sum(axis=0).astype(int)
    
    ## update device_rollup
    device_rollup = {'overall': [], 'mobile': []}
    for k, v in total[categories['overall']].to_dict().items(): 
        print(k, v)
        if v > 0:
            device_rollup['overall'].append(k)
            if k in categories['mobile']:
                device_rollup['mobile'].append(k) 
    
    ## update rollup_l7
    rollup_l7 = total[categories['rollup']].to_dict()
    

    Output:

    ## df: dataframe
    # print(df)
       android  iphone  web  overall  mobile
    0        1       0    0     True    True
    1        0       1    1     True    True
    2        0       0    1     True   False
    3        0       1    0     True    True
    4        0       0    1     True   False
    5        0       0    0    False   False
    6        1       0    0     True    True
    
    ## total
    # print(total.to_dict())
    {'android': 2, 'iphone': 2, 'mobile': 4, 'overall': 6, 'web': 3}
    
    ## device_rollup
    # print(device_rollup)
    {
        'mobile': ['android', 'iphone'], 
        'overall': ['android', 'iphone', 'web']
    }
    
    ## rollup_l7
    # print(rollup_l7)
    {'mobile': 4, 'overall': 6}
    

    Data

    day_value = {
        'android': [1,0,0,0,0,0,1],
        'iphone':  [0,1,0,1,0,0,0],
        'web':     [0,1,1,0,1,0,0],
    }