ruby sorting filter

Is there a better way to filter and sort together in Ruby?


(a) There are lines similar to the ones below

......, start Mon 10/30 10:08
......, start Thu 12/21 9:21

(b) What I want to do: sort the lines by date and time, but remove the lines whose start date is today

(c) Below is the Ruby code that implements this

time = Time.new

$mon = time.month
$mday = time.day

# ......

array_tmp = results_all.lines.reject do |x|
    times = x.split(/,/)[1].scan(/\d+/).map(&:to_i)
    times[0] == $mon &&  times[1] == $mday
end

array_tmp.sort_by {|x| x.split(/,/)[1].scan(/\d+/).map(&:to_i)}

My question is:

Is there a better, more elegant way to do this filter and sort at the same time in Ruby?


Solution

  • Suppose we are given:

    require 'date'

    arr = [
      "has , start Mon 10/30 10:08",
      "dog , start Thu 9/24 4:08",
      "fleas , start Thu 12/21 9:21",
      "Saffi , start Thu 10/29 19:33",
      "My , start Thu 9/7 9:54"
    ]
    
    today = Date.today
      #=> #<Date: 2023-12-21 ((2460300j,0s,0n),+0s,2299161j)>
    

    One may then write:

    RGX = /
          \d{1,2}   # match 1 or 2 digits
          \/        # match a forward slash
          \d{1,2}   # match 1 or 2 digits
          [ ]+      # match 1 or more spaces
          \d{1,2}   # match 1 or 2 digits
          :         # match a colon
          \d{2}     # match two digits
          $         # match the end of the string
          /x        # invoke free-spacing regex definition mode
    
    # Define the helper first so the snippet runs top to bottom.
    def string_to_datetime(str)
      DateTime.strptime(str[RGX], '%m/%d %H:%M')
    end

    arr.filter_map do |s|
      dt = string_to_datetime(s)
      [dt, s] unless dt.to_date == today
    end.sort.map(&:last)
      #=> ["My , start Thu 9/7 9:54",
      #    "dog , start Thu 9/24 4:08",
      #    "Saffi , start Thu 10/29 19:33",
      #    "has , start Mon 10/30 10:08"]
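
To see what string_to_datetime does to a single line, its two steps can be run separately. The sketch below is self-contained, so it repeats the pattern of RGX in compact form under the local name rgx; the names rgx, stamp, dt and parts are illustrative only. Because the strings carry no year, DateTime.strptime fills in the current year, which is why the results shown in this answer are dated 2023.

```ruby
require 'date'

# Same pattern as RGX, written on one line (local name is illustrative).
rgx = /\d{1,2}\/\d{1,2}[ ]+\d{1,2}:\d{2}$/

str = "has , start Mon 10/30 10:08"

# String#[] with a regex returns the matched substring, or nil if nothing matches.
stamp = str[rgx]
puts stamp

# The format has no year directive, so the current year is assumed.
dt = DateTime.strptime(stamp, '%m/%d %H:%M')
parts = [dt.month, dt.day, dt.hour, dt.minute]
p parts
```

A line that does not end in a month/day and time would make str[rgx] return nil, and DateTime.strptime would then raise, so input with other layouts would need to be filtered out first.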
    

    The intermediate calculation

    arr.filter_map do |s|
      dt = string_to_datetime(s)
      [dt, s] unless dt.to_date == today 
    end
      #=> [[#<DateTime: 2023-10-30T10:08:00+00:00 ((2460248j,36480s,0n),+0s,2299161j)>,
      #     "has , start Mon 10/30 10:08"],
      #    [#<DateTime: 2023-09-24T04:08:00+00:00 ((2460212j,14880s,0n),+0s,2299161j)>,
      #     "dog , start Thu 9/24 4:08"],
      #    [#<DateTime: 2023-10-29T19:33:00+00:00 ((2460247j,70380s,0n),+0s,2299161j)>,
      #     "Saffi , start Thu 10/29 19:33"],
      #    [#<DateTime: 2023-09-07T09:54:00+00:00 ((2460195j,35640s,0n),+0s,2299161j)>,
      #     "My , start Thu 9/7 9:54"]]
    

    displays the array that is being sorted. Each element is a two-element array: the first element is a DateTime instance computed from the string's month-day-time representation, and the second is the string itself. Arrays are compared element by element, so the order is decided by the DateTime instances, and the strings are consulted only to break ties between equal DateTime instances. See the docs for Array#sort.
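
The element-by-element comparison described above can be checked directly with Array#<=>. This is a small sketch with made-up values; the variable names and the placeholder strings are not part of the original data.

```ruby
require 'date'

early = DateTime.new(2023, 9, 7, 9, 54)
late  = DateTime.new(2023, 10, 30, 10, 8)

# The first elements differ, so they alone decide the order.
by_time = [early, "zebra"] <=> [late, "apple"]

# The first elements are equal, so the strings break the tie.
by_tie = [early, "b"] <=> [early, "a"]

# Sorting pairs and keeping the last element of each, as in the answer.
pairs = [[late, "x"], [early, "b"], [early, "a"]]
sorted = pairs.sort.map(&:last)

p by_time, by_tie, sorted
```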

    One could alternatively write:

    arr.reject do |s|
      string_to_datetime(s).to_date == today 
    end.sort_by { |s| string_to_datetime(s) }
      #=> ["My , start Thu 9/7 9:54",
      #    "dog , start Thu 9/24 4:08",
      #    "Saffi , start Thu 10/29 19:33",
      #    "has , start Mon 10/30 10:08"]
    

    which could be expected to be somewhat faster than the first method (because sort_by evaluates its block just once per element, even though string_to_datetime is called twice for each string that is kept). See Enumerable#sort_by.
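
That sort_by evaluates its block once per element can be confirmed by counting the calls. The sketch below uses a hypothetical stand-in for string_to_datetime, the lambda sort_key, which returns [month, day, hour, minute] as integers so that the example does not depend on today's date. It shows only the call count and is not a timing comparison of the two methods.

```ruby
lines = [
  "has , start Mon 10/30 10:08",
  "dog , start Thu 9/24 4:08",
  "Saffi , start Thu 10/29 19:33",
  "My , start Thu 9/7 9:54"
]

# Hypothetical stand-in for string_to_datetime: [month, day, hour, minute].
sort_key = ->(s) { s.scan(/\d+/).last(4).map(&:to_i) }

key_calls = 0
sorted = lines.sort_by do |s|
  key_calls += 1          # count how often the key is computed
  sort_key.call(s)
end

puts sorted
puts "sort_by computed #{key_calls} keys for #{lines.size} lines"
```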

    See also Date::today, Enumerable#filter_map, DateTime::strptime, String#[] and DateTime#to_date. See Time#strftime for formatting directives used by DateTime::strptime.