I have a dataset as such:
#Load the required libraries
import pandas as pd
#Create dataset
data = {'id': [1, 1, 1, 1, 1,1, 1, 1, 1, 1, 1,
2, 2,2,2,2,
3, 3, 3, 3, 3, 3,
4, 4,4,4,4,4,4,4,
5, 5, 5, 5, 5,5, 5, 5,5],
'cycle': [1,2, 3, 4, 5,6,7,8,9,10,11,
1,2, 3,4,5,
1,2, 3, 4, 5,6,
1,2,3,4,5,6,7,8,
1,2, 3, 4, 5,6,7,8,9,],
'Salary': [7, 7, 7,8,9,10,11,12,13,14,15,
4, 5,6,7,8,
8,9,10,11,12,13,
8,1,2,3,4,5,6,7,
7, 7,9,10,11,12,13,14,15,],
'Children': ['No', 'Yes', 'Yes', 'Yes', 'Yes', 'No','No', 'Yes', 'Yes', 'Yes', 'No',
'Yes', 'No', 'Yes', 'No', 'Yes',
'No','Yes', 'Yes', 'No','No', 'Yes',
'Yes','Yes', 'Yes', 'No','No', 'Yes', 'Yes', 'Yes',
'No', 'Yes', 'No', 'No', 'Yes', 'Yes', 'Yes', 'Yes', 'No',],
'Days': [123, 128, 66, 66, 120, 141, 52,96, 120, 141, 52,
96, 120,128, 66, 120,
15,123, 128, 66, 120, 141,
141,128, 66, 123, 128, 66, 120,141,
123, 128, 66, 123, 128, 66, 120, 141, 52,],
}
#Convert to dataframe
df = pd.DataFrame(data)
print("df = \n", df)
The dataframe looks as such:
Here, every id has different cycles as per the 'cycle' column. For example,
id-1 has maximum 11 cycles.
id-2 has maximum 5 cycles.
id-3 has maximum 6 cycles.
id-4 has maximum 8 cycles.
id-5 has maximum 9 cycles.
Now, for every id, I wish to delete rows from the bottom.
For example,
For id-1, delete last four rows.
For id-2, delete last two rows.
For id-3, delete last three rows.
For id-4, delete last five rows.
For id-5, delete last six rows.
The dataframe then looks as such:
Can somebody please let me know how do I achieve this task in Python?
Create dictionary for specify number of deleted rows and compare by counter from back by GroupBy.cumcount
with ascending=False
mapped id
column by Series.map
, so possible filter by boolean indexing
:
d = {1:4,2:2,3:3,4:5,5:6}
df = df[df.groupby('id').cumcount(ascending=False).ge(df['id'].map(d))]
print (df)
id cycle Salary Children Days
0 1 1 7 No 123
1 1 2 7 Yes 128
2 1 3 7 Yes 66
3 1 4 8 Yes 66
4 1 5 9 Yes 120
5 1 6 10 No 141
6 1 7 11 No 52
11 2 1 4 Yes 96
12 2 2 5 No 120
13 2 3 6 Yes 128
16 3 1 8 No 15
17 3 2 9 Yes 123
18 3 3 10 Yes 128
22 4 1 8 Yes 141
23 4 2 1 Yes 128
24 4 3 2 Yes 66
30 5 1 7 No 123
31 5 2 7 Yes 128
32 5 3 9 No 66