I have the following dataset in Python:
import pandas as pd

# Create dataset
data = {'id': [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3],
        'cycle': [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 8],
        'Salary': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        'Days': [123, 128, 66, 120, 141, 128, 66, 120, 141, 52, 96, 120, 141, 52, 96, 120, 141, 123, 15, 85, 36, 58, 89],
        }
# Convert to dataframe
df = pd.DataFrame(data)
print("df = \n", df)
Now, for every id (group), I want to set 'Salary' to 1 or 0 depending on a per-id cycle threshold.
For example:
For id=1, set 'Salary' to 1 where cycle >= 4
For id=2, set 'Salary' to 1 where cycle >= 3
For id=3, set 'Salary' to 1 where cycle >= 6
In the result, 'Salary' should be 1 for the rows at or above the threshold and 0 for all other rows.
Can somebody please let me know how to achieve this in Python?
Here is an option using map(). If there are more IDs, they can be added to the dictionary.
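# Map each id to the first cycle at which Salary becomes 1
d = {1: 4, 2: 3, 3: 6}
# assign() returns a new DataFrame; assign the result back to df to keep it
df.assign(Salary=df['cycle'].ge(df['id'].map(d)).astype(int))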
Output:
    id  cycle  Salary  Days
0    1      1       0   123
1    1      2       0   128
2    1      3       0    66
3    1      4       1   120
4    1      5       1   141
5    1      6       1   128
6    1      7       1    66
7    1      8       1   120
8    1      9       1   141
9    2      1       0    52
10   2      2       0    96
11   2      3       1   120
12   2      4       1   141
13   2      5       1    52
14   2      6       1    96
15   3      1       0   120
16   3      2       0   141
17   3      3       0   123
18   3      4       0    15
19   3      5       0    85
20   3      6       1    36
21   3      7       1    58
22   3      8       1    89
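For reference, here is a self-contained sketch of the same map() approach that writes the result back to df. The helper name flag_salary and the variable name thresholds are only illustrative and not part of the original answer. One behavior worth knowing: an id that is missing from the dictionary maps to NaN, and a comparison against NaN is False, so such rows end up with Salary = 0.

```python
import pandas as pd

data = {
    "id": [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3],
    "cycle": [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 8],
    "Salary": [0] * 23,
    "Days": [123, 128, 66, 120, 141, 128, 66, 120, 141, 52, 96, 120, 141,
             52, 96, 120, 141, 123, 15, 85, 36, 58, 89],
}
df = pd.DataFrame(data)

# Per-id thresholds taken from the question: id -> first cycle with Salary = 1
thresholds = {1: 4, 2: 3, 3: 6}


def flag_salary(frame, thresholds):
    """Return a copy of frame with Salary = 1 where cycle >= the id's threshold.

    flag_salary is a hypothetical helper name used only for this sketch.
    """
    # ids missing from the dict become NaN here, which compares as False below
    limit = frame["id"].map(thresholds)
    return frame.assign(Salary=frame["cycle"].ge(limit).astype(int))


# assign() does not modify df in place, so keep the returned DataFrame
df = flag_salary(df, thresholds)
print(df)
```

If ids without a threshold should be treated differently (for example, raise an error instead of defaulting to 0), check limit.isna() before building the column.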