matlab vector curve-fitting data-manipulation data-fitting

How to remove data points from a data set in Matlab


In Matlab, I have a vector that is a 1x204 double. It represents a biological signal over a certain period of time, and over that time the signal varies: sometimes it peaks and sometimes it stays relatively small, close to the baseline value of 0. I need to plot the reciprocal of this data (on the x-axis) against another set of data (on the y-axis) in order to do some statistical analysis.

The problem is the points close to 0. For example, the smallest value I have is -0.00497, and 1/(-0.00497) gives about -201, which turns into an "outlier" because the rest of the data is nowhere near that large. So I am trying to remove the values very close to 0 from the data set so that they do not affect 1/value.

I know that I can use the cftool to remove those points from the plot, but how do I get the vector with those points removed? Is there a way of actually removing the points? From the cftool and removing those points on the original, I was able to generate the code and find out which exact points they are, but I don't know how to create a vector with those points removed.

Can anyone help?

I did try the following for loop to remove the values, with 'total_BOLD_time_course' being my signal and '1/total_BOLD_time_course' being what I want to plot. The problem is that my if statement sets total_BOLD_time_course(i) = 1, which is not the true value, so the points still exist in the vector but now take the value 1. I just want them to be gone from the vector.

for i = 1:204 
  if total_BOLD_time_course(i) < 0 && total_BOLD_time_course(i) < -0.01
   total_BOLD_time_course(i) = 1;

  else if total_BOLD_time_course(i) > 0 && total_BOLD_time_course(i) < 0.01
     total_BOLD_time_course(i) = 1 ;
  end
 end
end

Solution

  • To blank out points in an array without changing its length, use the syntax

    total_BOLD_time_course( abs(total_BOLD_time_course) < 0.01 ) = nan
    

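    that makes them 'blank' on the graph without destroying the temporal sequence of the data points. Note that NaN propagates through most arithmetic (for example, mean returns NaN if any element is NaN), so further calculations have to skip these entries explicitly, e.g. with the 'omitnan' option of mean.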

    If actually destroying timepoints is not a concern then do

    total_BOLD_time_course( abs(total_BOLD_time_course) < 0.01 ) = []
    

    Then there'll be fewer data points, and their indices will no longer line up with any other time_course you have unless you delete the same elements from it too. But the advantage is that it will "close up" the gaps in the graph.
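
    If a second vector has to stay paired with the signal, one possible approach is to build the logical mask once and apply it to both vectors. This is only a sketch: the sample values and the names other_data, threshold, keep, x and y are made up for illustration and do not come from the question.

    ```matlab
    % Hypothetical sample data; substitute the real 1x204 signal and its paired vector.
    total_BOLD_time_course = [0.5, -0.00497, 0.2, 0.004, -0.3];
    other_data             = [10,   20,      30,  40,    50];   % assumed y-axis data

    threshold = 0.01;                                  % cutoff used in the question
    keep = abs(total_BOLD_time_course) >= threshold;   % true for points to keep

    x = 1 ./ total_BOLD_time_course(keep);   % element-wise reciprocal of kept points
    y = other_data(keep);                    % same mask, so x and y stay paired
    ```

    Because the originals are left untouched and only the masked copies x and y are plotted, the full time course is still available if a different threshold is needed later.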

    -- PS

    note that in your code, the phrase

    x<0 && x<-0.01
    

    is redundant because any number that is less than -0.01 is automatically less than 0. I believe the second comparison should be x > -0.01, giving x<0 && x>-0.01, and then the condition picks out the points you intended.
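
    For comparison, a loop in the style of the question could be sketched as follows. The sample values and the names x and near_zero are hypothetical. The loop only flags the points, and the deletion happens once afterwards so that indices do not shift while iterating.

    ```matlab
    x = [0.5, -0.00497, 0.2, 0.004, -0.3];   % hypothetical sample values

    near_zero = false(size(x));              % flags for points to delete
    for i = 1:numel(x)
        if x(i) < 0 && x(i) > -0.01          % small negative value
            near_zero(i) = true;
        elseif x(i) > 0 && x(i) < 0.01       % small positive value
            near_zero(i) = true;
        end
    end

    x(near_zero) = [];                       % remove all flagged points at once
    ```

    Unlike abs(x) < 0.01, these two conditions do not flag a value that is exactly 0, so the vectorized form is the safer choice before taking a reciprocal.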