
Ruby iterator with a function: return the first value from the function without iterating the whole list


I have an array, arr=[x1, x2, x3...], and a function func. I want the value that func returns for the first x in arr for which func(x) is truthy.

Essentially:

# my_x is the return value of func(x)
# for the first x in arr where func(x) is truthy;
# the rest of arr is not processed.

my_x=arr.ruby_magic{|x| func(x) } 

my_x should be equal to the first truthy value returned by func(x).

Suppose each x in arr is a regex pattern. Without having to run every regex, I want to return the capture group from the first match.

In Python, I would write a generator with next. It runs each search until one returns a truthy value, binds that match to m, and yields m.group(1). If nothing is truthy, None is used as the default, but that default could be anything:

import re 

patterns=[r"no match", r": (Value.*?pref)", r": (Value.*)", r"etc..."]

s=""" 
This is the input txt
This is a match if the other is not found: Value 1

This is the match I am looking for first: Value 1 pref

Last line.
"""

val_I_want=next(
        (m.group(1) for p in patterns 
            if (m:=re.search(rf'{p}', s))), None)

I have not found the equivalent in Ruby.

I could do an explicit loop:

# s is the same multiline string as above...

patterns=[/no match/, /: (Value.*?pref)/, /: (Value.*)/,/etc.../]

val_I_want=nil 
patterns.each{|p| 
    m=p.match(s)
    if m then
        val_I_want=m[1]
        break 
    end     
}
# val_I_want is either nil or 
# the first match capture group that is true

That is the functionality I want, but it seems kind of wordy in comparison to the Python generator.

I have tried grep with the first argument being a predicate. The problem here is that the entire result array is generated before next can be used:

patterns.grep(proc {|p| p.match(s)}) {|m| m.match(s)[1]}.to_enum
# can then use .next on that.
# BUT it runs through the entire array when all I want is the first

#<Enumerator: ["Value 1 pref", "Value 1"]:each>

I tried find, but that returns the first pattern that matches, not the capture group:

> e=patterns.find{|p| p.match(s) }
=> /: (Value.*?pref)/

# Now I would have to rerun match with the pattern found to get the text
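One way to avoid the second match is to wrap the explicit loop in a small method that returns the block's own result as soon as it is truthy. This is only a sketch: first_result is a hypothetical helper name (not part of Ruby), and the shortened sample string is an assumption made for illustration.

```ruby
# Hypothetical helper: returns the block's first truthy result, or nil.
def first_result(items)
  items.each do |item|
    value = yield(item)
    return value if value # stop as soon as the block returns something truthy
  end
  nil
end

# Shortened sample data, assumed for illustration
patterns = [/no match/, /: (Value.*?pref)/, /: (Value.*)/]
s = "fallback: Value 1\nwanted: Value 1 pref\n"

tried = [] # records which patterns were actually run
val = first_result(patterns) do |p|
  tried << p
  m = p.match(s)
  m && m[1] # capture group on a match, nil otherwise
end

p val          # the capture group from the first pattern that matched
p tried.length # number of patterns that were run
```

With this data the third pattern is never run, because the second one already produced a capture group.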

Ideas?


Thank you so much for the helpful ideas. I added several new things to my Ruby kit bag.

After looking at and trying several, I think the best for me is combining Dogbert's lazy.filter_map with Stefan's suggestion of s[regex, 1]:

val_I_want=patterns.lazy.filter_map { |p| s[p, 1] }.first

Interestingly, the s[p, 1] syntax does NOT accept a dynamically built regex inside the [] operator unless it is wrapped in parentheses, like so: (Regexp.new "#{p.to_s}(.*)"). That takes away from the attractiveness.

I ended up using:

patterns.lazy.filter_map { |p| card.match("#{p}(.*)")&.[](1) }.first

But this works too:

patterns.find{ |p| m = card.match("#{p}(.*)") and break m[1] }

In a more general case, you can do:

def func(x)
  # silly function for show
  x*x
end     

arr=[1,3,5,6,7,8,9]

p arr.lazy.filter_map { |x| (fx=func(x))>30 ? [x,fx] : nil }.first
# [6, 36]

And a very honorable mention to engineersmnky's modification of my .find attempt:

val_I_want = patterns.find {|p| m = p.match(s) and break m[1] }
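The reason this works is that break with an argument ends the find call immediately and makes that argument its return value, while the low-precedence and lets the assignment to m happen first. A minimal sketch of both outcomes, using made-up sample patterns and strings:

```ruby
patterns = [/(1)/, /(2)/, /(3)/]

seen = [] # records which patterns were actually tried
hit = patterns.find { |p| seen << p; m = p.match("2") and break m[1] }
# hit is the capture group rather than the pattern, and /(3)/ is never tried

miss = patterns.find { |p| m = p.match("4") and break m[1] }
# nothing matches, so the block never breaks and find returns nil

p hit
p seen.length
p miss
```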
   

Solution

  • You can use .lazy.filter_map { .. }.first. This will not run the block for elements after the first truthy value is found.

    irb> [1, 2, 3, 4, 5].lazy.filter_map { |x| p x; x > 3 ? x * 2 : nil }.first
    1
    2
    3
    4
    => 8
    

    This will return x * 2 for the first x that's greater than 3. I added p x; to show that this code doesn't process the 5th element of the list.


    The Regex example:

    irb> regexes = [/(1)/, /(2)/, /(3)/]
    => [/(1)/, /(2)/, /(3)/]
    irb> regexes.lazy.filter_map { |regex| p regex; regex.match("2")&.[](1) }.first
    /(1)/
    /(2)/
    => "2"
    

    Using String[Regexp, Integer] syntax as suggested by @Stefan in a comment below:

    regexes.lazy.filter_map { |regex| p regex; string[regex, 1] }.first
    

    Demo:

    irb> regexes = [/(1)/, /(2)/, /(3)/]
    => [/(1)/, /(2)/, /(3)/]
    irb> string = "2"
    => "2"
    irb> regexes.lazy.filter_map { |regex| p regex; string[regex, 1] }.first
    /(1)/
    /(2)/
    => "2"
    irb> string = "4"
    => "4"
    irb> regexes.lazy.filter_map { |regex| p regex; string[regex, 1] }.first
    /(1)/
    /(2)/
    /(3)/
    => nil
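Applied to the patterns and input text from the question, the same idea might look like the sketch below. The tried array and the "n/a" fallback are additions for illustration only; the fallback plays the role of the default passed to Python's next. Note that filter_map requires Ruby 2.7 or later.

```ruby
patterns = [/no match/, /: (Value.*?pref)/, /: (Value.*)/, /etc.../]

s = <<~TEXT
  This is the input txt
  This is a match if the other is not found: Value 1

  This is the match I am looking for first: Value 1 pref

  Last line.
TEXT

tried = [] # records which patterns were actually run
val_I_want = patterns.lazy.filter_map { |p| tried << p; s[p, 1] }.first

# When nothing matches, .first returns nil, so || can supply a default
# (illustrative choice of default value)
fallback = patterns.lazy.filter_map { |p| "no such text"[p, 1] }.first || "n/a"

p val_I_want
p tried.length
p fallback
```

Only the first two patterns are run here, since the second one yields a capture group and .first stops pulling from the lazy enumerator at that point.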