I have a list of lists with sublists all of which contain float values. For example the one below has 2 lists with sublists each:
mylist = [[[2.67, 2.67, 0.0, 0.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [0.0, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0]], [[2.67, 2.67, 2.0, 2.0], [0.0, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [0.0, 0.0, 0.0, 0.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0]]]
I want to calculate the standard deviation and the mean of the sublists and what I applied was this:
mean = [statistics.mean(d) for d in mylist]
stdev = [statistics.stdev(d) for d in mylist]
but it takes also the 0.0 values that I do not want because I turned them to 0 in order not to be empty ones. Is there a way to ignore these 0s as they do not exist in the sublist?To not take them under consideration at all? I could not find a way for how I am doing it.
You can use numpy's nanmean
and nanstd
functions.
import numpy as np
def zero_to_nan(d):
array = np.array(d)
array[array == 0] = np.NaN
return array
mean = [np.nanmean(zero_to_nan(d)) for d in mylist]
stdev = [np.nanstd(zero_to_nan(d)) for d in mylist]