I'm trying to mask bad pixels in a dataset taken from a detector. In my attempt to come up with a general way to do this so I can run the same code across different images, I tried a few different methods, but none of them ended up working. I'm pretty new with coding and data analysis in Python, so I could use a hand putting things in terms that the computer will understand.
As an example, consider the matrix
A = np.array([[3,5,50],[30,2,6],[25,1,1]])
What I'm wanting to do is set any element in A that is two standard deviations away from the mean equal to zero. The reason for this is that later in the code, I'm defining a function that only uses the nonzero values for the calculation, since the zeros are part of the mask.
I know this masking technique works, but I tried extending the following code to work with the standard deviation:
mask = np.ones(np.shape(A))
mask.flat[A.flat > 20] = 0
What I tried was:
mask = np.ones(np.shape(A))
for i,j in A:
mask.flat[A[i,j] - 2*np.std(A) < np.mean(A) < A[i,j] + 2*np.std(A)] = 0
Which throws the error:
ValueError: too many values to unpack (expected 2)
If anyone has a better technique to statistically remove bad pixels in an image, I'm all ears. Thanks for the help!
==========
EDIT
After some trial and error, I got to a place that could help clarify my question. The new code is:
for i in A:
for j in i:
mask.flat[ j - 2*np.std(A) < np.mean(A) < j + 2*np.std(A)] = 0
This throws an error saying 'unsupported iterator index'. What I'm wanting to happen is that the for loop iterates across each element in the array, checks if it's less/greater than 2 standard deviations from the mean, and it is, sets it to zero.
Here is an approach that will be sligthly faster on larger images:
import numpy as np
import matplotlib.pyplot as plt
# generate dummy image
a = np.random.randint(1,5, (5,5))
# generate dummy outliers
a[4,4] = 20
a[2,3] = -6
# initialise mask
mask = np.ones_like(a)
# subtract mean and normalise to standard deviation.
# then any pixel in the resulting array that has an absolute value > 2
# is more than two standard deviations away from the mean
cond = (a-np.mean(a))/np.std(a)
# find those pixels and set them to zero.
mask[abs(cond) > 2] = 0
Inspection:
a
array([[ 1, 1, 3, 4, 2],
[ 1, 2, 4, 1, 2],
[ 1, 4, 3, -6, 1],
[ 2, 2, 1, 3, 2],
[ 4, 1, 3, 2, 20]])
np.round(cond, 2)
array([[-0.39, -0.39, 0.11, 0.36, -0.14],
[-0.39, -0.14, 0.36, -0.39, -0.14],
[-0.39, 0.36, 0.11, -2.12, -0.39],
[-0.14, -0.14, -0.39, 0.11, -0.14],
[ 0.36, -0.39, 0.11, -0.14, 4.32]])
mask
array([[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[1, 1, 1, 0, 1],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 0]])