I am trying to find the prominent peaks of some sets of data. One of my data sets looks like the following:
As you can see in this there is a clear peak at approx w = -1.6. This is a prominent peak as there are no other peaks in the data set that are more than 10% of its height. So I store this w value as it is a prominent peak.
I also have another data set:
Here you can see I have a large peak at approx w = 1.6. However, there is also a peak at approx w = 0.9 which has a Spectral Power that is larger than 10% of the largest peak. So I do not store this w value as it is not a prominent peak.
I am trying to create a function so that for a given set of Spectral power called y. The function checks if the largest peak is prominent (i.e there are no other peaks that are more than 10% the size of the largest peak). If largest peak is prominent then it prints the w value the peak occurs at. If it is not prominent then it prints("Not prominent").
from scipy.signal import find_peaks, peak_widths, peak_prominences
def check_prominent_peaks(y):
peaks, _ = find_peaks(y)
peak_heights = y[peaks]
# Find the peak with the maximum height
max_peak_index = np.argmax(peak_heights)
max_peak_height = peak_heights[max_peak_index]
threshold = 10
# Check if the maximum peak 10 times larger than the others
if max_peak_height > threshold * np.max(peak_heights[peak_heights != max_peak_height]):
print(w[peaks][max_peak_index])
else:
print("Not Prominent")
However, this includes each individual data point as a peak. I want an individual peak to be the combination of the data points. Below is a snippet of the data.
w_data1 = np.array([-3.14159265e+00, -3.03687290e+00, -2.77507351e+00, -2.61799388e+00,
-2.35619449e+00, -2.09439510e+00, -1.80641578e+00, -1.72787596e+00,
-1.71740398e+00, -1.61268423e+00, -1.60744824e+00, -1.60221225e+00,
-1.59697627e+00, -1.59174028e+00, -1.58650429e+00, -1.58126830e+00,
-1.57079633e+00, -1.56556034e+00, -1.56032435e+00, -1.55508836e+00,
-1.54985238e+00, -1.51843645e+00, -1.46607657e+00, -1.41371669e+00,
-1.36135682e+00, -1.30899694e+00, -1.04719755e+00, -7.85398163e-01,
-5.23598776e-01, -2.61799388e-01, -4.44089210e-16, 2.61799388e-01,
5.23598776e-01, 1.04719755e+00, 1.57079633e+00, 2.09439510e+00,
2.61799388e+00])
PS_data1 = array([1.95875121e+01, 2.17955953e+01, 3.02014494e+01, 3.86345651e+01,
6.58105570e+01, 1.46458509e+02, 8.09563221e+02, 2.21754612e+04,
6.40104090e+02, 5.20057432e+05, 1.28551555e+05, 1.25288212e+05,
8.11800261e+04, 1.40052558e+06, 9.64441464e+05, 2.08724844e+06,
9.40242662e+05, 2.61656013e+04, 1.32967749e+05, 1.22127952e+05,
1.23733666e+05, 3.60411360e+03, 2.00470602e+04, 4.04207652e+03,
1.10741091e+04, 5.54534985e+02, 5.58959484e+01, 5.58399196e+00,
3.91053138e+01, 2.73404280e+01, 0.00000000e+00, 1.59311017e+01,
2.25706295e+01, 1.01049830e+02, 2.22757819e+02, 1.27038938e+01,
1.34606021e+01])
w_data2 = array([-2.61799388e+00, -2.09439510e+00, -1.57079633e+00, -1.04719755e+00,
-8.48230016e-01, -8.42994029e-01, -8.37758041e-01, -8.32522053e-01,
-8.27286065e-01, -8.22050078e-01, -8.16814090e-01, -8.11578102e-01,
-8.06342114e-01, -8.01106127e-01, -7.95870139e-01, -7.90634151e-01,
-5.23598776e-01, -2.61799388e-01, -4.44089210e-16, 2.61799388e-01,
3.14159265e-01, -4.44089210e-16, 2.61799388e-01, 5.23598776e-01,
8.11578102e-01, 8.16814090e-01, 8.22050078e-01, 8.27286065e-01,
8.32522053e-01, 8.37758041e-01, 8.42994029e-01, 8.48230016e-01,
8.53466004e-01, 8.58701992e-01, 1.04719755e+00, 1.30899694e+00,
1.57079633e+00, 1.62315620e+00, 1.62839219e+00, 1.63362818e+00,
1.63886417e+00, 1.64410016e+00, 1.64933614e+00, 1.65457213e+00,
1.65980812e+00, 1.66504411e+00, 1.67028009e+00, 1.67551608e+00,
1.68075207e+00, 1.68598806e+00, 1.69122405e+00, 1.69646003e+00,
1.70169602e+00, 1.91637152e+00, 2.09439510e+00, 2.61799388e+00])
PS_data2 = array([3.25876564e+00, 3.81409794e+00, 1.11474756e+03, 8.12689697e+02,
1.20040924e+04, 9.14052161e+03, 2.39588219e+04, 2.58246107e+05,
2.25913494e+05, 2.00792586e+04, 7.48461712e+04, 7.79328551e+04,
1.78038258e+04, 1.24493550e+04, 2.49796877e+03, 3.03134727e+03,
3.66156269e+02, 5.77324315e+00, 0.00000000e+00, 3.44609756e+01,
4.28032233e+01, 0.00000000e+00, 3.44609756e+01, 4.21229883e+02,
1.10116850e+05, 5.88833899e+04, 4.86932316e+03, 9.00693183e+05,
4.35660173e+04, 1.12480933e+05, 6.37520293e+05, 1.45519285e+05,
1.84586394e+04, 1.07850037e+05, 8.74590279e+03, 5.45370145e+03,
9.68703597e+01, 1.96921122e+05, 1.89929087e+06, 2.66404045e+05,
3.42301329e+05, 1.20854982e+06, 9.16540956e+05, 3.22951992e+06,
9.83937935e+05, 1.27623169e+05, 1.38875057e+05, 3.03198348e+05,
1.66383588e+05, 3.25265340e+04, 2.45382866e+03, 5.44667942e+04,
9.83374694e+04, 2.93687201e+02, 3.75552805e+00, 4.48805516e+00])
If I correctly understand your question the following function should do the job:
def identification(x, y, ratio=10., prominence=1e5, distance=10):
peaks, stats = signal.find_peaks(y, prominence=prominence, distance=distance)
order = np.flip(np.argsort(stats["prominences"]))
values = []
if len(peaks) == 1:
values.append(x[peaks[0]])
elif len(peaks) > 1:
for i, j in zip(order[:-1], order[1:]):
if stats["prominences"][i] >= ratio * stats["prominences"][j]:
values.append(x[peaks[i]])
return {
"values": values,
"order": order,
"peaks": peaks,
"stats": stats,
}
It scans for peaks, sort them by heights, then only keeps peaks which are ratio
times taller than the next.
Following calls follow your observations:
identification(w_data1, PS_data1)
# {'values': [-1.5812683],
# 'order': array([0]),
# 'peaks': array([15]),
# 'stats': {'prominences': array([2087228.8524879]),
# 'left_bases': array([0]),
# 'right_bases': array([30])}}
identification(w_data2, PS_data2)
# {'values': [],
# 'order': array([2, 1, 0]),
# 'peaks': array([ 7, 27, 43]),
# 'stats': {'prominences': array([ 258242.84823436, 900596.3126403 , 3229516.16447195]),
# 'left_bases': array([ 0, 21, 21]),
# 'right_bases': array([18, 36, 54])}}