I have a list of ndarrays:
list1 = [t1, t2, t3, t4, t5]
Each t is a 2-D array of [class, probability] rows:
t1 = np.array([[10,0.1],[30,0.05],[30,0.1],[20,0.1],[10,0.05],[10,0.05],[0,0.5],[20,0.05],[10,0.0]], np.float64)
t2 = np.array([[0,0.05],[0,0.05],[30,0],[10,0.25],[10,0.2],[10,0.25],[20,0.1],[20,0.05],[10,0.05]], np.float64)
...
For every t in the list, I want to group the rows by their first element and average the corresponding second elements:
t1out = [[0,0.5],[10,(0.1+0.05+0.05+0)/4],[20,(0.1+0.05)/2],[30,0.075]]
t2out = [[0,0.05],[10,0.1875],[20,0.075],[30,0]]
....
After generating these outputs for t_1 ... t_n, I want to plot the probabilities over the classes for each t, where the first elements are the classes (0, 10, 20, 30) and the second elements are the probabilities with which these classes occur (e.g. 0.1, 0.7, 0.15, 0). Something like a histogram or a probability distribution in the form of a bar plot:
plt.bar([classes],[probabilities])
plt.bar([item[0] for item in t1out],[item[1] for item in t1out])
Here's one approach using itertools.groupby:
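from itertools import groupby
from statistics import mean

def fun(t):
    # groupby only merges consecutive rows, so sort by the first element first
    s = sorted(t, key=lambda x: x[0])
    # average the second element within each group
    return [[k, mean(i[1] for i in v)] for k, v in groupby(s, key=lambda x: x[0])]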
fun(t1)
[[0.0, 0.5],
[10.0, 0.05],
[20.0, 0.07500000000000001],
[30.0, 0.07500000000000001]]
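The sort inside fun matters because groupby only merges rows that are next to each other. A minimal sketch of that behavior, using a few made-up rows (the names rows, unsorted_keys and sorted_keys are only for illustration):

```python
from itertools import groupby

# Made-up rows: class 10 appears twice, but not on adjacent rows
rows = [(10, 0.1), (30, 0.05), (10, 0.05)]

def first(r):
    return r[0]

# Without sorting, class 10 is split into two separate groups
unsorted_keys = [k for k, _ in groupby(rows, key=first)]

# After sorting by the same key, each class forms exactly one group
sorted_keys = [k for k, _ in groupby(sorted(rows, key=first), key=first)]

print(unsorted_keys)  # [10, 30, 10]
print(sorted_keys)    # [10, 30]
```

Skipping the sort would therefore give several partial averages for the same class instead of one.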
And to apply to all arrays:
[fun(t) for t in [t1,t2]]
[[[0.0, 0.5],
[10.0, 0.05],
[20.0, 0.07500000000000001],
[30.0, 0.07500000000000001]],
[[0.0, 0.05], [10.0, 0.1875], [20.0, 0.07500000000000001], [30.0, 0.0]]]
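If you would rather stay in NumPy, the same grouping can probably be done without a Python-level loop. This is a sketch of an alternative technique (np.unique plus np.bincount), not the groupby approach above; group_mean is a hypothetical helper name, and it assumes each t is a 2-D float array with the class in column 0:

```python
import numpy as np

def group_mean(t):
    """Return an (n_classes, 2) array of [class, mean of column 1]."""
    # classes: sorted unique values of column 0
    # inverse: for every row, the index of its class in `classes`
    classes, inverse = np.unique(t[:, 0], return_inverse=True)
    inverse = inverse.ravel()
    sums = np.bincount(inverse, weights=t[:, 1])  # per-class sum of column 1
    counts = np.bincount(inverse)                 # per-class number of rows
    return np.column_stack((classes, sums / counts))

t1 = np.array([[10,0.1],[30,0.05],[30,0.1],[20,0.1],[10,0.05],[10,0.05],[0,0.5],[20,0.05],[10,0.0]], np.float64)
t1out = group_mean(t1)

# Split the result into its two columns
classes, probabilities = t1out.T
```

The two columns can then be passed straight to the bar plot from the question, e.g. plt.bar(classes, probabilities). Since the classes are 10 apart, a larger width argument may make the bars easier to read.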