I generated the following code:
cov_vac_merge['Partially Vaccinated'] = cov_vac_merge['First Dose'] - cov_vac_merge['Second Dose']
cov_vac_merge['% Partially Vaccinated'] = cov_vac_merge['Second Dose'] / cov_vac_merge['First Dose']
covid_summary = cov_vac_merge.groupby('State')[['Vaccinated', 'First Dose', 'Second Dose', 'Partially Vaccinated', '% Partially Vaccinated']].sum().sort_values('Partially Vaccinated', ascending=False)
In the second line of the code, where I divide Second Dose by First Dose, I do not get the right result. Below is an example of the output I get:
State  Vaccinated  First Dose  Second Dose  Partially Vaccinated  % Partially Vaccinated
UK        5606041     5870786      5606041                264745              527.854055
Instead of 527.85 for % Partially Vaccinated, I should get 5606041/5870786 = 0.95. Does anyone know what I am doing wrong in the division part of my code?
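The .sum() after your groupby also adds up the per-row ratios in '% Partially Vaccinated', which is why that column comes out so large. I think the code should be something like this: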
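# Sum the count columns, but average the ratio column instead of summing it.
# String aggregator names avoid the need to import numpy.
covid_summary = cov_vac_merge.groupby('State').agg({
    'Vaccinated': 'sum',
    'First Dose': 'sum',
    'Second Dose': 'sum',
    'Partially Vaccinated': 'sum',
    '% Partially Vaccinated': 'mean',
})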
For details about grouping and aggregation in pandas, I recommend reading this official article.
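One caveat worth checking: the mean of the per-row ratios is not always equal to total Second Dose divided by total First Dose, which is the figure the question expects (5606041/5870786). If a state has several rows, it may be safer to sum the counts first and divide afterwards. Here is a minimal sketch on made-up data; the numbers and the result column names (first_dose, second_dose, mean_of_ratios, ratio_of_sums) are hypothetical and only for illustration.

```python
import pandas as pd

# Made-up data: two rows for UK, one for FR (not the real figures).
cov_vac_merge = pd.DataFrame({
    "State": ["UK", "UK", "FR"],
    "First Dose": [100, 300, 200],
    "Second Dose": [90, 150, 100],
})

# Per-row ratio, computed the same way as in the question.
cov_vac_merge["% Partially Vaccinated"] = (
    cov_vac_merge["Second Dose"] / cov_vac_merge["First Dose"]
)

# Sum the counts per state and average the per-row ratio.
covid_summary = cov_vac_merge.groupby("State").agg(
    first_dose=("First Dose", "sum"),
    second_dose=("Second Dose", "sum"),
    mean_of_ratios=("% Partially Vaccinated", "mean"),
)

# Alternative: divide the summed counts after aggregating.
covid_summary["ratio_of_sums"] = (
    covid_summary["second_dose"] / covid_summary["first_dose"]
)

print(covid_summary)
```

In this toy data the two columns differ for UK (mean of 0.9 and 0.5 versus 240/400), so which one you want depends on whether each row should count equally or be weighted by its number of doses.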