.
Hello, guys!
I have a dfA
(Table A) containing the amount of days that some products have been available (days_survived
). I need to count the number of products that were available each day in total (Table B). I mean, I need counting rows in dfA
to discover the survival rate each day for the first 5 days (df2
).
Table A:
+-------+--------------+
| id | days_survived|
+-------+--------------+
| 1 | 1 |
| 2 | 3 |
| 3 | 10 |
| 4 | 40 |
| 5 | 4 |
| 6 | 9 |
+-------+--------------+
Table B (Expected results analyzing the first 5 days):
+-------+----------------+
| day | #count_survived|
+-------+----------------+
| 1 | 6 |
| 2 | 5 |
| 3 | 5 |
| 4 | 4 |
| 5 | 3 |
+-------+----------------+
This result means that on the first day 6 products in total were available, then only 5 on the second and third day, then only 4 on the fourth day and finally only 3 on the fifth day.
code:
# create df
import pandas as pd
d = {'id': [1,2,3,4,5,6], 'days_survived': [1,3,10,40,4,9]}
dfA = pd.DataFrame(data=d)
Could anyone help me, please? :)
Use list comprehension with flattening and filtering and then count:
comp = [y for x in dfA['days_survived'] for y in range(1, x + 1) if y < 6]
s = pd.Series(comp).value_counts().rename_axis('day').reset_index(name='#count_survived')
print (s)
day #count_survived
0 1 6
1 3 5
2 2 5
3 4 4
4 5 3
Another solution with Counter
:
from collections import Counter
comp = [y for x in dfA['days_survived'] for y in range(1, x + 1) if y < 6]
d = Counter(comp)
df = pd.DataFrame({'day':list(d.keys()), '#count_survived':list(d.values())})