performance perl string-matching

Most efficient way to check if $string starts with $needle in Perl


Given two string variables $string and $needle in Perl, what's the most efficient way to check whether $string starts with $needle?

  • $string =~ /^\Q$needle\E/ is the most direct expression of the requirement I could think of, but it is by far the least efficient of the solutions I tried.
  • index($string, $needle) == 0 works and is relatively efficient for some values of $string and $needle, but when the needle is not at the start it needlessly goes on searching for it at other positions.
  • substr($string, 0, length($needle)) eq $needle should be simple and efficient, but in most of my (few) tests it was no faster than index().

Is there a canonical way to do this in Perl that I'm not aware of, or any way to optimise one of the above solutions?

(In my particular use case, $string and $needle are going to be different on each run, so precompiling a regexp is not an option.)


Example of how to measure the performance of a given solution (here run from a POSIX sh):

string='somewhat not so longish string' needle='somew'
time perl -e '
  ($n,$string,$needle) = @ARGV;
  for ($i=0;$i<$n;$i++) {

    index($string, $needle) == 0

  }' 10000000 "$string" "$needle"

With those values, index() performs better than substr()+eq on this system with Perl 5.14.2, but with:

string="aaaaabaaaaabaaaaabaaaaabaaaaabaaaaab" needle="aaaaaa"

the result is reversed.
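As an alternative to timing a whole perl process from the shell, the three candidates can be compared side by side with the core Benchmark module. This is only a sketch: the labels regex, index and substr are made up for the table, each candidate is wrapped in a sub (which adds the same call overhead to all of them), and the numbers will vary with the Perl version and the inputs.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# The two input pairs from the question.
my @cases = (
    [ 'somewhat not so longish string',       'somew'  ],
    [ 'aaaaabaaaaabaaaaabaaaaabaaaaabaaaaab', 'aaaaaa' ],
);

for my $case (@cases) {
    my ($string, $needle) = @$case;
    print "string='$string' needle='$needle'\n";

    # A negative count asks Benchmark to run each sub for at least
    # that many CPU seconds, then print a relative comparison table.
    cmpthese(-1, {
        regex  => sub { my $r = $string =~ /^\Q$needle\E/ },
        index  => sub { my $r = index($string, $needle) == 0 },
        substr => sub { my $r = substr($string, 0, length($needle)) eq $needle },
    });
}
```

The first pair exercises the case where the needle is a prefix; the second exercises the case where it is not, which is where index() has to scan the rest of the string.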


Solution

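  • rindex $string, $substring, 0

    searches backwards for $substring in $string starting at position 0 (here $substring plays the role of the question's $needle). A match can therefore only be found at position 0, which happens exactly when $substring is a prefix of $string; the call returns 0 in that case and -1 otherwise. Example: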

    > rindex "abc", "a", 0
    0
    > rindex "abc", "b", 0
    -1
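For use in a script, the rindex check could be wrapped in a small helper. A minimal sketch follows; the name starts_with is hypothetical and not a Perl built-in.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $string begins with $needle.
sub starts_with {
    my ($string, $needle) = @_;
    # With a start position of 0, rindex can only match at offset 0,
    # so the result is either 0 (prefix) or -1 (not a prefix).
    return rindex($string, $needle, 0) == 0;
}

print starts_with('somewhat not so longish string', 'somew') ? "yes\n" : "no\n";
print starts_with('aaaaabaaaaab', 'aaaaaa')                  ? "yes\n" : "no\n";
print starts_with('abc', '')                                 ? "yes\n" : "no\n";
```

The empty needle counts as a prefix of every string, which matches the behaviour of the regexp and substr() variants in the question.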