I have three arrays (or lists, or whatever):
x - The abscissa. A set of data which is not evenly spaced.
y - The ordinate. A set of data representing y=f(x).
peaks - A set of data whose elements each contain an ordered pair (x,y), which represent the peak values found in y.
Here is a portion of the x and y data:
2.00 1.5060000000e-07
...
...
5.60 3.4100000000e-08
5.80 1.7450000000e-07
6.00 7.1700000000e-08
6.20 5.2900000000e-08
6.40 2.5570000000e-07
6.50 4.8420000000e-07
6.60 6.1900000000e-08
6.80 2.2700000000e-07
7.00 2.3500000000e-08
7.20 3.6500000000e-08
7.40 1.0158000000e-06
7.50 3.5100000000e-08
7.60 2.0080000000e-07
7.80 1.6585000000e-06
8.00 2.1190000000e-07
8.20 5.3370000000e-07
8.40 5.7840000000e-07
8.50 4.5230000000e-07
...
...
50.00 1.8200000000e-07
Here is print(peaks):
[(3.7999999999999998, 4.0728000000000002e-06), (5.4000000000000004, 5.4893000000000001e-06), (10.800000000000001, 1.2068e-05), (12.699999999999999, 4.1904799999999999e-05), (14.300000000000001, 8.3118000000000006e-06), (27.699999999999999, 6.5239000000000003e-06)]
I use the data to make a plot, similar to this:
The blue dots in the plot are the peaks, and the red dots are the valleys. The red dots are not necessarily accurate, though: there is a red dot to the right of the last peak, which was not intended.
Using the data above, I am attempting to find the valleys as follows:
Go through the peaks array (or list, or whatever it is) and, for each adjacent pair of peaks, find their indices in the x and y arrays (or lists, or whatever they are), then search the y array bound by those indices for the minimum value. Also find the corresponding x value at that index. Then append the (x,y) pair to an array v1 (or list, or whatever), which will be like peaks. Then plot v1 as red dots.
Here is the code:
    for i in xrange(1,len(peaks)):
        # Find the indices of the two peaks in the actual arrays
        # (e.g. x[j1] and y[j1]) where the peaks occur
        j1=np.where(x==peaks[i-1][0])
        j1=int(j1[0])
        j2=np.where(x==peaks[i][0])
        j2=int(j2[0])
        # In the array y[j1:j2], find the index of the minimum value
        j=np.where(y==min(y[j1:j2]))
        # What if there is more than one minimum?
        if(len(j[0])>1):
            # Use the first one.
            # I incorrectly assumed this would be > j1,
            # but it could be anywhere in y
            jt=int(j[0][0])
            v1.append((x[jt],y[jt]))
            # And the last one.
            # I incorrectly assumed this would be < j2,
            # but it could be anywhere in y. But we do know at least one of the
            # indices found will be between j1 and j2.
            jt=int(j[0][-1])
            v1.append((x[jt],y[jt]))
        else:
            # When only 1 index is found, no problem: it has to be j1 < j < j2
            j=int(j[0])
            v1.append((x[j],y[j]))
Here is the problem:
When I search for the minimum value(s) of y in a certain range like this:
j=np.where(y==min(y[j1:j2]))
It returns the indices of every element equal to that minimum throughout the entire data set of y, because the comparison y==... is made against the whole array. But I want j to contain only the indices of the minimum between j1 and j2, where I searched.
How can I constrain the search?
I could check to see if j1 < j < j2, but I would prefer to constrain the search to return only values of j in that range, if possible.
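For reference, that after-the-fact check can be written as a boolean mask. This is only a minimal sketch with made-up values: the contents of y and the choice of j1 and j2 are illustrative assumptions, not taken from the data above. Note that the slice y[j1:j2] includes index j1, so the condition used here is j1 <= j < j2.

```python
import numpy as np

# Made-up values: the minimum of y[1:4] is 1.0, which also occurs at index 0.
y = np.array([1.0, 9.0, 3.0, 1.0, 8.0, 2.0, 7.0])
j1, j2 = 1, 4

# Comparing against the full array finds matches everywhere in y.
j = np.where(y == y[j1:j2].min())[0]

# Keep only the matches that fall inside the searched range.
j_in_range = j[(j >= j1) & (j < j2)]
```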
Once I figure that out, then I will add logic to limit the indices if the peaks are more than a width w apart. So if the peaks are more than w apart, then j1 will be no less than j2-w/2, where j2 is the index of the peak.
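A minimal sketch of that limit, under two assumptions that the description does not settle: w is measured in array indices (number of samples) rather than in units of x, and w/2 is rounded down. The function name clamp_lower_index is hypothetical.

```python
def clamp_lower_index(j1, j2, w):
    """Hypothetical helper: return the lower bound of the search window.

    Assumes w is a width in array indices. If the two peak indices are
    more than w apart, the window starts w // 2 samples before j2;
    otherwise it starts at j1 unchanged.
    """
    if j2 - j1 > w:
        return j2 - w // 2
    return j1
```

The returned value would then replace j1 in the slice y[j1:j2].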
You could slice the array first and do the == comparison against the slice instead of the whole array:
sliced_y = y[j1:j2]
j = np.where(sliced_y == min(sliced_y))[0] + j1
You need to add the lower bound (the + j1), otherwise you only have the "index" with respect to the sliced part.
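Put together, the loop could look something like the sketch below. It is not a drop-in replacement for the code in the question: the function name valleys_between_peaks and the sample data are made up for illustration, it assumes each peak's x-value occurs exactly once in x, and it keeps every tied minimum in the range rather than only the first and last.

```python
import numpy as np

def valleys_between_peaks(x, y, peaks):
    """Hypothetical helper: (x, y) of the minimum between adjacent peaks."""
    x = np.asarray(x)
    y = np.asarray(y)
    valleys = []
    for (xa, _), (xb, _) in zip(peaks[:-1], peaks[1:]):
        # Locate the two peaks in x (assumes each value occurs once).
        j1 = int(np.flatnonzero(x == xa)[0])
        j2 = int(np.flatnonzero(x == xb)[0])
        sliced_y = y[j1:j2]
        # Indices are relative to the slice, so shift them back by j1.
        j = np.flatnonzero(sliced_y == sliced_y.min()) + j1
        valleys.extend((float(x[k]), float(y[k])) for k in j)
    return valleys

# Made-up data: the value 1.0 also occurs at index 0, outside the first range.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.0, 9.0, 3.0, 1.0, 8.0, 2.0, 7.0])
peaks = [(1.0, 9.0), (4.0, 8.0), (6.0, 7.0)]
v1 = valleys_between_peaks(x, y, peaks)
```

If only one valley per pair of peaks is wanted, np.argmin(sliced_y) + j1 gives the index of the first occurrence of the minimum directly.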