i'm trying to compute the weights for each time series or data values by the groupby of "plant_name" and "month" and I cannot see the solution. I have data that looks like this -
plant_name month adjusted_wspd
0 ARIZONA I 1 7.62
1 ARIZONA I 2 7.37
2 ARIZONA I 3 6.72
3 ARIZONA I 4 6.67
4 ARIZONA I 5 7.07
5 ARIZONA I 6 7.53
6 ARIZONA I 7 8.09
7 ARIZONA I 8 8.83
8 ARIZONA I 9 9.41
9 ARIZONA I 10 9.44
10 ARIZONA I 11 8.92
11 ARIZONA I 12 8.37
12 CAETITE I 1 7.10
13 CAETITE I 2 7.27
14 CAETITE I 3 7.26
I have tried writing a function and using "apply" to the defined function but I get an incorrect solution that looks like this because the function is probably doing the math incorrectly:
ARIZONA I 1 1.0
2 1.0
3 1.0
4 1.0
5 1.0
6 1.0
7 1.0
8 1.0
9 1.0
10 1.0
11 1.0
12 1.0
CAETITE I 1 1.0
2 1.0
3 1.0
My code looks like this:
def my_agg(x):
names = {'Weigths_WSPD': (x['adjusted_wspd'] )/x['adjusted_wspd'].sum()}
return pd.Series(names, index=['Weights_WSPD'])
dfmm.groupby(["plant_name", "month"]).apply(my_agg)
My answer should look like this in percent (%):
ARIZONA I 1 7.93
2 7.67
3 7.00
4 6.95
5 7.36
6 7.84
7 8.42
8 9.19
9 9.80
10 9.83
11 9.29
12 8.72
CAETITE I 1 33.82
2 33.61
3 33.56
thank you!
Let's try groupby transform
sum on just plant_name
not plant_name
and month
:
df['Weigths_WSPD'] = (
df['adjusted_wspd'] /
df.groupby('plant_name')['adjusted_wspd'].transform('sum') *
100
).round(2)
plant_name month adjusted_wspd Weigths_WSPD
0 ARIZONA I 1 7.62 7.93
1 ARIZONA I 2 7.37 7.67
2 ARIZONA I 3 6.72 7.00
3 ARIZONA I 4 6.67 6.95
4 ARIZONA I 5 7.07 7.36
5 ARIZONA I 6 7.53 7.84
6 ARIZONA I 7 8.09 8.42
7 ARIZONA I 8 8.83 9.19
8 ARIZONA I 9 9.41 9.80
9 ARIZONA I 10 9.44 9.83
10 ARIZONA I 11 8.92 9.29
11 ARIZONA I 12 8.37 8.72
12 CAETITE I 1 7.10 32.82
13 CAETITE I 2 7.27 33.61
14 CAETITE I 3 7.26 33.56
If a Series
is necessary:
s = df.set_index(['plant_name', 'month'])
s = (
s['adjusted_wspd'] /
s.groupby(level=0)['adjusted_wspd'].transform('sum') *
100
).round(2).rename('Weigths_WSPD')
plant_name month
ARIZONA I 1 7.93
2 7.67
3 7.00
4 6.95
5 7.36
6 7.84
7 8.42
8 9.19
9 9.80
10 9.83
11 9.29
12 8.72
CAETITE I 1 32.82
2 33.61
3 33.56
Name: Weigths_WSPD, dtype: float64
DataFrame constructor:
df = pd.DataFrame({
'plant_name': {0: 'ARIZONA I', 1: 'ARIZONA I', 2: 'ARIZONA I',
3: 'ARIZONA I', 4: 'ARIZONA I', 5: 'ARIZONA I',
6: 'ARIZONA I', 7: 'ARIZONA I', 8: 'ARIZONA I',
9: 'ARIZONA I', 10: 'ARIZONA I', 11: 'ARIZONA I',
12: 'CAETITE I', 13: 'CAETITE I', 14: 'CAETITE I'},
'month': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9, 9: 10,
10: 11, 11: 12, 12: 1, 13: 2, 14: 3},
'adjusted_wspd': {0: 7.62, 1: 7.37, 2: 6.72, 3: 6.67, 4: 7.07, 5: 7.53,
6: 8.09, 7: 8.83, 8: 9.41, 9: 9.44, 10: 8.92, 11: 8.37,
12: 7.1, 13: 7.27, 14: 7.26}
})