arrays ruby element threshold remove-if

How to remove all elements from a sorted Ruby array that are closer to their nearest neighbour than a limit?


I have a sorted array of real numbers in my Ruby program. I want to remove all elements that are very "similar", i.e. whose difference from a neighbour is smaller than a given limit. In the end I want to keep only the elements that are well distinguishable from the others: the distinct elements, for which no other element of the original array is closer than the limit.

Currently I am experimenting with these two approaches:

limit=0.5
vvs=vv.sort.reverse.each_cons(2).map{|a,b| (a-b).abs<limit ? nil : a}.compact

and

vvs=vv.each_cons(3).map{|a,b,c| (a-b).abs<limit && (b-c).abs<limit  ? nil : b}.compact

I need this method for a program that tries to synchronize subtitles, where the values may contain some noise. For that reason I want to analyze only the distinct elements, which can still be told apart when some additive noise is present.

My original real data, from "Catch 22": https://pastebin.com/mRiS02mb


Solution

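  • There seems to be some ambiguity in the question. I interpret it as I stated in a comment on the question: an element is kept only if it is at least the limit away from both of its neighbours in the sorted array.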

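    data = [ 3.42,  5.49,  6.12,  6.48,  7.11,  8.79,  9.36,
             9.54, 10.86, 10.95, 11.07, 13.08, 14.41, 14.92]
    limit = 0.5

    # Pad both ends with infinite sentinels so the first and last elements
    # each have two neighbours, then keep every middle element b whose gaps
    # to a and c are both at least limit.
    ([-Float::INFINITY].concat(data) << Float::INFINITY).each_cons(3).
      select { |a,b,c| b-a >= limit && c-b >= limit }.
      map { |_,b,_| b }
      #=> [3.42, 5.49, 7.11, 8.79, 13.08, 14.41, 14.92]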
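
The same rule can also be wrapped in a method that compares each element with its existing neighbours directly, so that no sentinel values are needed. This is only a minimal sketch: the method name distinct_elements is an assumption made for illustration, and the input is assumed to be already sorted in ascending order.

```ruby
# Hypothetical helper (the name is an assumption, not from the question).
# Returns the elements of an ascending-sorted array whose gap to every
# existing neighbour is at least `limit`.
def distinct_elements(sorted, limit)
  sorted.each_with_index.select do |x, i|
    # The first element has no left neighbour, the last has no right one.
    far_from_left  = i.zero? || x - sorted[i - 1] >= limit
    far_from_right = i == sorted.size - 1 || sorted[i + 1] - x >= limit
    far_from_left && far_from_right
  end.map(&:first) # drop the indices, keep the values
end

data = [3.42, 5.49, 6.12, 6.48, 7.11, 8.79, 9.36,
        9.54, 10.86, 10.95, 11.07, 13.08, 14.41, 14.92]

p distinct_elements(data, 0.5)
#=> [3.42, 5.49, 7.11, 8.79, 13.08, 14.41, 14.92]
```

Under this interpretation an element is dropped as soon as one neighbour is too close, which is why both checks are joined with &&. Joining the two "too close" tests with && instead, as in the each_cons(3) attempt in the question, drops an element only when both neighbours are close, and each_cons(3) on the unpadded array never yields the first or last element as the middle value.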