ruby, pattern-matching, glob, string-matching

Wildcard string matching in Ruby


I'd like to write a utility function or module that provides simple wildcard/glob matching on strings. I'm not using regular expressions because the patterns will be supplied by users through a configuration file. I couldn't find a stable gem for this; I tried joker, but had problems setting it up.

The functionality I'm looking for is simple. For example, given the following patterns, here are the matches:

pattern | test-string         | match
========|=====================|====================
*hn     | john, johnny, hanna | true , false, false     # wildcard, similar to /hn$/i
*hn*    | john, johnny, hanna | true , true , false     # like /hn/i
hn      | john, johnny, hanna | false, false, false     # /^hn$/i
*h*n*   | john, johnny, hanna | true , true , true
etc...

I'd like this to be as efficient as possible. I thought about creating regexes from the pattern strings, but that seemed rather inefficient to do at runtime. Any suggestions on this implementation? Thanks.

EDIT: I'm using Ruby 1.8.7


Solution

  • I don't see why you think it would be inefficient. Predictions about this sort of thing are notoriously unreliable; you should establish that it is actually too slow before bending over backwards to find a faster way, and then profile it to make sure that this is where the problem lies. (By the way, switching to 1.9 gives an average speed boost of 3-4x.)

    Anyway, it should be pretty easy to do this, something like:

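    class Globber
      # Convert a glob into a case-insensitive regex: escape everything,
      # then turn each escaped '*' back into a wildcard.
      def self.parse_to_regex(str)
        escaped = Regexp.escape(str).gsub('\*', '.*?')
        # \A and \z anchor to the whole string; ^ and $ only anchor to a line.
        Regexp.new "\\A#{escaped}\\z", Regexp::IGNORECASE
      end

      def initialize(str)
        @regex = self.class.parse_to_regex str
      end

      # Returns true or false rather than a match index.
      def =~(str)
        !!(str =~ @regex)
      end
    end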
    
    
    glob_strs = {
      '*hn'    => [['john', true ], ['johnny', false], ['hanna', false]],
      '*hn*'   => [['john', true ], ['johnny', true ], ['hanna', false]],
      'hn'     => [['john', false], ['johnny', false], ['hanna', false]],
      '*h*n*'  => [['john', true ], ['johnny', true ], ['hanna', true ]]
    }
    
    puts glob_strs.all? { |to_glob, examples|
      examples.all? do |to_match, expectation|
        result = Globber.new(to_glob) =~ to_match
        result == expectation
      end
    }
    # >> true
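
To check the efficiency concern rather than guess at it, a rough measurement along these lines may help. This is only a sketch: `glob_to_regex` is a hypothetical standalone helper that mirrors the conversion above, the pattern list and iteration count are made up for illustration, and the timings will vary by machine and Ruby version. It compares rebuilding the regex on every match against compiling each pattern once (for example, right after the configuration file is read) and reusing it.

```ruby
require 'benchmark'

# Hypothetical helper for this sketch: turn a glob into an anchored,
# case-insensitive regex in which '*' matches any run of characters.
def glob_to_regex(glob)
  escaped = Regexp.escape(glob).gsub('\*', '.*?')
  /\A#{escaped}\z/i
end

# Illustrative data, taken from the examples in the question.
patterns = ['*hn', '*hn*', 'hn', '*h*n*']
names    = %w[john johnny hanna]

# Compile each pattern once up front.
compiled = patterns.map { |glob| glob_to_regex(glob) }

iterations = 10_000 # arbitrary; raise or lower to taste

Benchmark.bm(22) do |bm|
  bm.report('recompile every match') do
    iterations.times do
      patterns.each { |glob| names.each { |name| name =~ glob_to_regex(glob) } }
    end
  end

  bm.report('compile once, reuse') do
    iterations.times do
      compiled.each { |regex| names.each { |name| name =~ regex } }
    end
  end
end
```

If the second row is fast enough for your workload, building the regexes once at load time is probably all the optimization that is needed.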