Apply z-score across all attributes by country

I'm trying to clean up a dataset that has data on every country in the world from 2000-2015. The population data by year is quite bad - I want to assign a z scores for each country's population data by year so I can see which data points to drop as outliers. How would I do this? I'm thinking I need to use groupby(), but I'm not sure how to deploy it.

I'm working with this WHO Kaggle dataset: https://www.kaggle.com/kumarajarshi/life-expectancy-who/data#

The data generally looks like this:

Example

Solution

Maybe, something like this might work -

import numpy as np, pandas as pd
l1 = ['a'] * 5 + ['b'] * 10 + ['c'] * 8
l2 = list(np.random.randint(10,20,size=5)) + list(np.random.randint(100,150, size=10)) + list(np.random.randint(75,100, size=8))
df = pd.DataFrame({'cat':l1, 'values':l2}) #creating a dummy dataframe
df
    cat  values
0    a      18
1    a      17
2    a      11
3    a      13
4    a      11
5    b     102
6    b     103
7    b     119
8    b     113
9    b     100
10   b     113
11   b     102
12   b     108
13   b     128
14   b     126
15   c      75
16   c      96
17   c      81
18   c      90
19   c      80
20   c      95
21   c      96
22   c      86

df['z-score'] = df.groupby(['cat'])['values'].apply(lambda x: (x - x.mean())/x.std())
df

     cat  values   z-score
0    a      18     1.206045
1    a      17     0.904534
2    a      11    -0.904534
3    a      13    -0.301511
4    a      11    -0.904534
5    b     102    -0.919587
6    b     103    -0.821759
7    b     119     0.743496
8    b     113     0.156525
9    b     100    -1.115244
10   b     113     0.156525
11   b     102    -0.919587
12   b     108    -0.332617
13   b     128     1.623951
14   b     126     1.428295
15   c      75    -1.520176
16   c      96     1.059516
17   c      81    -0.783121
18   c      90     0.322461
19   c      80    -0.905963
20   c      95     0.936674
21   c      96     1.059516
22   c      86    -0.168908