Search code examples
pythonpandaspandas-groupbysklearn-pandas

Pandas - Counting rows in a df to discover the survival rate each day


.

Hello, guys!

I have a dfA (Table A) containing the amount of days that some products have been available (days_survived). I need to count the number of products that were available each day in total (Table B). I mean, I need counting rows in dfA to discover the survival rate each day for the first 5 days (df2).

Table A:

+-------+--------------+
| id    | days_survived|
+-------+--------------+
| 1     |  1           |
| 2     |  3           |
| 3     |  10          | 
| 4     |  40          |
| 5     |  4           |
| 6     |  9           |
+-------+--------------+

Table B (Expected results analyzing the first 5 days):

+-------+----------------+
| day   | #count_survived|
+-------+----------------+
| 1     |  6             |
| 2     |  5             |
| 3     |  5             | 
| 4     |  4             |
| 5     |  3             |
+-------+----------------+

This result means that on the first day 6 products in total were available, then only 5 on the second and third day, then only 4 on the fourth day and finally only 3 on the fifth day.

code:

# create df
import pandas as pd
d = {'id': [1,2,3,4,5,6], 'days_survived': [1,3,10,40,4,9]}
dfA = pd.DataFrame(data=d) 

Could anyone help me, please? :)


Solution

  • Use list comprehension with flattening and filtering and then count:

    comp = [y for x in dfA['days_survived'] for y in range(1, x + 1) if y < 6]
    s = pd.Series(comp).value_counts().rename_axis('day').reset_index(name='#count_survived')
    print (s)
       day  #count_survived
    0    1                6
    1    3                5
    2    2                5
    3    4                4
    4    5                3
    

    Another solution with Counter:

    from collections import Counter
    
    comp = [y for x in dfA['days_survived'] for y in range(1, x + 1) if y < 6]
    d = Counter(comp)
    df = pd.DataFrame({'day':list(d.keys()), '#count_survived':list(d.values())})