I have a dataframe where I want to count how often a specific value occurs per group. The code below gives the right answer, and now I want to add the result as a new column to my dataframe:
occur = df.groupby(['Code_5elaag','Essentieel_Optioneel']).size()
occur
**Code_5elaag Essentieel_Optioneel**
1101 essentieel 8
optioneel 8
1102 essentieel 8
optioneel 51
1103 essentieel 8
..
96231 optioneel 6
96232 essentieel 1
optioneel 2
96290 essentieel 9
optioneel 17
When I assign a new column to the frame, this is the output:
uniq['ess'] = df.groupby(['Code_5elaag'])['Essentieel_Optioneel'].transform(np.size)
Code_5elaag Omschrijving_5elaag Soort_Skill Aantal_skills ess
0 1101 Officieren landmacht taken 16 16 15
16 1102 Officieren luchtmacht taken 59 59 59
75 1103 Officieren marechaussee taken 16 16 16
But that is not what I want. I want to split Aantal_skills into how many are essentieel and how many are optioneel, so for the first row it should be 8 essentieel and 8 optioneel.
You are close; you need to group by both columns:
df['ess'] = df.groupby(['Code_5elaag','Essentieel_Optioneel'])['Essentieel_Optioneel'].transform('size')
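As a minimal sketch of what this does, here is the same call on a small made-up frame. Only the two column names come from the question; the values are hypothetical. Each row receives the size of its own (Code_5elaag, Essentieel_Optioneel) group:

```python
import pandas as pd

# Hypothetical toy data; only the column names are taken from the question.
df = pd.DataFrame({
    "Code_5elaag": [1101, 1101, 1101, 1102, 1102],
    "Essentieel_Optioneel": [
        "essentieel", "essentieel", "optioneel", "optioneel", "optioneel",
    ],
})

# Group by both columns, then broadcast each group's size back to its rows.
df["ess"] = (
    df.groupby(["Code_5elaag", "Essentieel_Optioneel"])["Essentieel_Optioneel"]
      .transform("size")
)
print(df)
```

Note that this gives a single count column, so an essentieel row carries the essentieel count and an optioneel row carries the optioneel count for its code.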
If you need 2 new columns, use crosstab with DataFrame.join:
out = df.join(pd.crosstab(df['Code_5elaag'], df['Essentieel_Optioneel']), on='Code_5elaag')
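A small sketch of the crosstab approach on made-up data may help show the resulting shape. The values are hypothetical, and the one-row-per-code frame `uniq` is built here with drop_duplicates purely as an assumption about how the frame in the question looks:

```python
import pandas as pd

# Hypothetical toy data; only the column names are taken from the question.
df = pd.DataFrame({
    "Code_5elaag": [1101, 1101, 1101, 1102, 1102],
    "Essentieel_Optioneel": [
        "essentieel", "essentieel", "optioneel", "optioneel", "optioneel",
    ],
})

# One row per code, one column per category, cells hold the counts.
counts = pd.crosstab(df["Code_5elaag"], df["Essentieel_Optioneel"])

# Attach both count columns to every row of the original frame.
out = df.join(counts, on="Code_5elaag")
print(out)

# Assumption: a frame with one row per code, like `uniq` in the question.
uniq = df.drop_duplicates("Code_5elaag")[["Code_5elaag"]]
uniq_out = uniq.join(counts, on="Code_5elaag")
print(uniq_out)
```

A code that has no rows for one of the categories gets 0 in that column, since crosstab fills missing combinations with zero counts.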