How to divide a data set into three different vectors based on peaks

I have a large data set that, when graphed, resembles the graph of sin(x)+1 with three peaks. I want to integrate under each peak and get three different areas. I do not know the coordinate location of the peak, and I cannot assume that I know the wavelength. So I need to find the three peaks and separate the data into three corresponding vectors. Any help would be greatly appreciated.


  • You can do this with the findpeaks function (part of the Signal Processing Toolbox). Take the following example:

    We generate two vectors x and y of data:

    x = linspace(0, 5*pi);  % x data (linspace defaults to 100 points).
    y = sin(x) + 1;         % y data.

    Then we use findpeaks to find the peaks of our dataset and retrieve their indexes (locs):

    >> [~, locs] = findpeaks(y)
    locs =
        11    51    90

    We can see that the function has found 3 peaks with coordinates: [x(11), y(11)], [x(51), y(51)] and [x(90), y(90)].
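
    Real measurements are rarely as clean as this example. As a rough sketch only, assuming your data carries some noise, the 'MinPeakProminence' option of findpeaks can be used to ignore small bumps. The synthetic noise and the threshold of 0.5 below are assumptions made for illustration, not values from the question:

    ```matlab
    % Assumption: synthetic noisy data standing in for a real measurement.
    rng(0);                                   % make the noise reproducible
    x = linspace(0, 5*pi, 500);
    y = sin(x) + 1 + 0.05*randn(size(x));

    % Without any option, every small noise bump is reported as a peak.
    [~, locs_all] = findpeaks(y);

    % Requiring a minimum prominence keeps only the dominant peaks.
    % The threshold 0.5 is an assumed value; tune it for your own data.
    [~, locs] = findpeaks(y, 'MinPeakProminence', 0.5);

    fprintf('%d peaks without filtering, %d with the prominence threshold\n', ...
            numel(locs_all), numel(locs));
    ```

    Comparing the two counts is a quick way to see whether a threshold is needed for your data at all.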

    Calling findpeaks without output arguments plots the data with the peaks overlaid, which is useful for visual verification:

    >> findpeaks(y)

    [Figure: signal with peaks overlaid]

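    We can split our dataset at the peak locations with the following for loop, storing the subsets in a cell array. Note that n peaks produce n + 1 segments:

    n = numel(locs);
    x_cell = cell(1, n + 1);   % preallocate: one more segment than peaks
    y_cell = cell(1, n + 1);
    for i = 1:n + 1
        if i == 1
            % From the start of the data to the first peak.
            x_cell{i} = x(1:locs(i));
            y_cell{i} = y(1:locs(i));
        elseif i <= n
            % Between two consecutive peaks.
            x_cell{i} = x(locs(i-1):locs(i));
            y_cell{i} = y(locs(i-1):locs(i));
        else
            % From the last peak to the end of the data.
            x_cell{i} = x(locs(i-1):end);
            y_cell{i} = y(locs(i-1):end);
        end
    end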

    This will give us:

    >> x_cell
    x_cell =
      1×4 cell array
        [1×11 double]    [1×41 double]    [1×40 double]    [1×11 double]


    >> y_cell
    y_cell =
      1×4 cell array
        [1×11 double]    [1×41 double]    [1×40 double]    [1×11 double]

    So we have divided our dataset successfully. Each cell contains a subset of the original dataset.
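
    The same split can also be written with a single vector of boundary indices, which avoids the three-way if. This is only an alternative sketch; the names edges, n_seg and idx are made up for illustration, and locs is hard-coded from the findpeaks output shown earlier so the snippet runs on its own:

    ```matlab
    x = linspace(0, 5*pi);        % same data as in the example
    y = sin(x) + 1;
    locs = [11, 51, 90];          % peak indices copied from the output above

    edges = [1, locs, numel(y)];  % boundaries: start, each peak, end
    n_seg = numel(edges) - 1;
    x_cell = cell(1, n_seg);
    y_cell = cell(1, n_seg);
    for i = 1:n_seg
        idx = edges(i):edges(i+1);   % neighbouring segments share one point
        x_cell{i} = x(idx);
        y_cell{i} = y(idx);
    end
    ```

    Because neighbouring segments share their boundary sample, the pieces join up without gaps when they are integrated later.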

    Now we can use trapz inside a for loop to numerically integrate each subset:

    k = numel(y_cell);
    A = zeros(1, k);   % preallocate the vector of areas
    for i = 1:k
        A(i) = trapz(x_cell{i}, y_cell{i});
    end

    These are the results:

    >> A
    A =
        2.6004    6.4099    6.0931    2.6004
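
    A quick sanity check, sketched here under the assumption that the segments share their boundary points as in the loop above: the partial areas should add up to the trapz integral of the whole signal, which in turn should be close to the exact value 5*pi + 2 for sin(x) + 1 on [0, 5*pi]. The variable names edges, idx and total are illustrative:

    ```matlab
    x = linspace(0, 5*pi);
    y = sin(x) + 1;
    edges = [1, 11, 51, 90, numel(y)];   % start, peak indices from above, end

    A = zeros(1, numel(edges) - 1);
    for i = 1:numel(A)
        idx = edges(i):edges(i+1);
        A(i) = trapz(x(idx), y(idx));
    end

    total = trapz(x, y);                 % integral over the full range
    fprintf('sum of parts = %.4f, whole = %.4f, exact = %.4f\n', ...
            sum(A), total, 5*pi + 2);
    ```

    If the sum of the parts differs noticeably from the whole, the segments probably overlap or leave a gap.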

    Finally I thought it would be nice to plot the different regions together using the area function and a for loop:

    hold on;
    for i = 1:k
        % Shade each region with a different gray level.
        area(x_cell{i}, y_cell{i}, 'FaceColor', i/k*[1, 1, 1]);
    end
    hold off; axis tight;
    grid on; box on;

    The different regions are clearly visible here:

    [Figure: area plot of different regions]
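
    The split above cuts the data at the peaks, so three peaks give four regions. If you would rather have exactly one region per peak, one possible variation (an assumption about what is wanted, not part of the method above) is to cut at the valleys instead, which can be located by running findpeaks on the negated signal. The names valleys, edges and A_peak are illustrative:

    ```matlab
    x = linspace(0, 5*pi);
    y = sin(x) + 1;

    % Minima of y are maxima of -y, so findpeaks(-y) returns the valleys.
    [~, valleys] = findpeaks(-y);

    edges = [1, valleys, numel(y)];      % start, each valley, end
    A_peak = zeros(1, numel(edges) - 1);
    for i = 1:numel(A_peak)
        idx = edges(i):edges(i+1);       % one segment per peak
        A_peak(i) = trapz(x(idx), y(idx));
    end
    ```

    For this example the data contains two interior valleys, so the result is three areas, one around each peak.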