I am doing SPC analysis using numpy/pandas.
Part of this is checking data series against the Nelson rules and the Western Electric rules.
For instance (rule 2 from the Nelson rules): Check if nine (or more) points in a row are on the same side of the mean.
Now I could simply implement checking a rule like this by iterating over the array.
As I mentioned in a comment, you may want to try using some stride tricks.
First, let's make an array of the signs of your anomalies; we can store it as np.int8 to save some space:
anomalies = x - x.mean()
signs = np.sign(anomalies).astype(np.int8)
Now for the strides. If you want to consider N consecutive points, you'll use
from numpy.lib.stride_tricks import as_strided
strided = as_strided(signs,
                     strides=(signs.itemsize, signs.itemsize),
                     shape=(signs.size, N))
That gives us a (x.size, N) rolling array: the first row holds the signs of x[0:N], the second those of x[1:N+1], and so on. Of course, the last N-1 rows will be meaningless (they run past the end of the array), so from now on we'll use
strided = strided[:-N+1]
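As a side note, newer NumPy versions (1.20 and later, as far as I know) ship `sliding_window_view`, which builds only the complete windows, so there is nothing to trim afterwards. This is a swapped-in helper rather than the `as_strided` call used above; a minimal sketch, with the sign array hard-coded purely for illustration:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

# Illustrative sign array (hard-coded here; normally np.sign(x - x.mean()))
signs = np.array([-1, 1, 1, 1, -1, 1, -1, -1, 1, 1], dtype=np.int8)
N = 3

# Row i is signs[i:i+N]; only the signs.size - N + 1 complete windows exist
windows = sliding_window_view(signs, N)
print(windows.shape)  # (8, 3)
```

The result is a read-only view, which is fine here since we only sum over it.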
Let's sum along the rows
consecutives = strided.sum(axis=-1)
That gives us an array of size (x.size-N+1) of values between -N and +N: we just have to find where the absolute values are N:
(indices,) = np.nonzero(np.abs(consecutives) == N)
indices is the array of the indices i of your array x for which the values x[i:i+N] are all on the same side of the mean.
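Putting the steps together, here is a rough sketch of how they could be wrapped in one function. The name `same_side_runs` is mine (hypothetical), and I have made one small variation: the strided shape is limited to the x.size-N+1 complete windows, so no row ever reaches past the end of the buffer and no trimming is needed.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def same_side_runs(x, N):
    """Start indices i where x[i:i+N] all lie strictly on one side of x.mean().

    Hypothetical helper, not part of NumPy.
    """
    x = np.asarray(x, dtype=float)
    n_windows = x.size - N + 1
    if N < 1 or n_windows < 1:
        return np.empty(0, dtype=np.intp)
    # +1 above the mean, -1 below, 0 exactly on it
    signs = np.sign(x - x.mean()).astype(np.int8)
    # Only complete windows, so the view stays inside the buffer
    windows = as_strided(signs,
                         shape=(n_windows, N),
                         strides=(signs.itemsize, signs.itemsize),
                         writeable=False)
    sums = windows.sum(axis=-1)  # each sum lies between -N and +N
    (indices,) = np.nonzero(np.abs(sums) == N)
    return indices

print(same_side_runs([1, 1, 1, 1, 0, 0, 0, 0], 3))  # [0 1 4 5]
```

Note that a point lying exactly on the mean has sign 0, so it breaks a run under this definition.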
Example with x = np.random.rand(10) and N = 3:
>>> x = np.array([ 0.57016436, 0.79360943, 0.89535982, 0.83632245, 0.31046202,
...                0.91398363, 0.62358298, 0.72148491, 0.99311681, 0.94852957])
>>> signs = np.sign(x-x.mean()).astype(np.int8)
array([-1, 1, 1, 1, -1, 1, -1, -1, 1, 1], dtype=int8)
>>> strided = as_strided(signs,strides=(1,1),shape=(signs.size,3))
array([[ -1, 1, 1],
[ 1, 1, 1],
[ 1, 1, -1],
[ 1, -1, 1],
[ -1, 1, -1],
[ 1, -1, -1],
[ -1, -1, 1],
[ -1, 1, 1],
[ 1, 1, -106],
[ 1, -106, -44]], dtype=int8)
>>> consecutive=strided[:-N+1].sum(axis=-1)
array([ 1, 3, 1, 1, -1, -1, -1, 1])
>>> np.nonzero(np.abs(consecutive)==N)
(array([1]),)
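If you want to convince yourself of that result, a plain-Python brute-force check over the same ten values is easy to write. This is only a sanity-check sketch (the helper name `same_side` is made up here), not something you would use on large arrays:

```python
# Same ten values as in the example above
x = [0.57016436, 0.79360943, 0.89535982, 0.83632245, 0.31046202,
     0.91398363, 0.62358298, 0.72148491, 0.99311681, 0.94852957]
N = 3
mean = sum(x) / len(x)

def same_side(window, mean):
    """True if every value in the window is strictly above, or strictly below, the mean."""
    return all(v > mean for v in window) or all(v < mean for v in window)

# Check every complete window x[i:i+N]
hits = [i for i in range(len(x) - N + 1) if same_side(x[i:i + N], mean)]
print(hits)  # [1]
```

It agrees with the strided version: only x[1:4] sits entirely on one side of the mean.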