Tags: python, pandas, dataframe, pandas-explode

How to explode a column in pandas and split a numeric value evenly across the new rows


I have this dataframe:

           A  B 
0  [0, 1, 2]  1 
1        foo  1 
2     [3, 4]  1

I would like to use the explode function on column "A" and then split the value in column "B" evenly across the rows produced from each original row. The result should look like this:

     A  B 
0    0  0.33
0    1  0.33
0    2  0.33
1  foo  1 
2    3  0.5 
2    4  0.5

Is this possible with the explode function? I can get this result with a loop over data.itertuples(), but the loop is very slow on a large dataframe. Is there a way to do this with explode, or with some other fast approach?

I would be very grateful for any help.


Solution

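  • Explode "A", then group by the index with groupby(level=0) and use transform('count') to get, for every exploded row, how many rows share its original index label. This works because explode repeats the original index label on each new row. Dividing 'B' by that count splits each original value evenly across its exploded rows.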

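    # Each list element gets its own row; the original index label is repeated
    out = df.explode('A')
    # Divide B by the number of rows that share the same index label
    out['B'] /= out['B'].groupby(level=0).transform('count')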
    

    Output:

         A         B
    0    0  0.333333
    0    1  0.333333
    0    2  0.333333
    1  foo  1.000000
    2    3  0.500000
    2    4  0.500000
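For anyone who wants to run the approach above end to end, here is a minimal, self-contained sketch. The construction of `df` is an assumption reconstructed from the table in the question, and the `totals` check is added only for illustration; it is not part of the original answer.

```python
import pandas as pd

# Assumed reconstruction of the question's dataframe:
# column A mixes lists and a plain string, column B holds the value to split.
df = pd.DataFrame({"A": [[0, 1, 2], "foo", [3, 4]], "B": [1, 1, 1]})

# explode gives each list element its own row and repeats the original
# index label, so rows 0, 0, 0 all come from the first source row.
out = df.explode("A")

# Count the rows per original index label and divide B by that count.
# Note: 'count' ignores NaN, so this assumes B has no missing values.
out["B"] = out["B"] / out["B"].groupby(level=0).transform("count")

# Illustrative sanity check: the shares of each source row should add
# back up to the original value of B.
totals = out.groupby(level=0)["B"].sum()

print(out)
print(totals)
```

Because the division is vectorized over the whole column, this avoids the per-row Python loop from the question. If the index of `df` is not unique to begin with, calling `df.reset_index(drop=True)` first is probably needed so that `groupby(level=0)` groups only rows that came from the same source row.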