regex perl pattern-matching character-class

Trying to understand this Perl regex bracketed character class?


Below is a script that I was playing with. As written, it prints: a

$tmp = "cd abc/test/.";
if ( $tmp =~ /cd ([\w\/\.])/ ) {
   print $1."\n";
}

BUT if I change it to:

$tmp = "cd abc/test/.";
if ( $tmp =~ /cd ([\w\/\.]+)/ ) {
   print $1."\n";
}

then it prints: abc/test/.

From my understanding, + matches one or more of the preceding element; please correct me if I am wrong. But why does the first case match only a? I thought it should match nothing.

Thank you.


Solution

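  • You are correct: + means one or more. In the first case the class has no quantifier, so it must match exactly one character from that class; it cannot match nothing, and the first character after "cd " is a. In the second case you match at least one character, followed by as many more as possible.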

    First one :

    "
    cd\            # Match the characters “cd ” literally
    (              # Match the regular expression below and capture its match into backreference number 1
       [\w\/\.]       # Match a single character present in the list below
                         # A word character (letters, digits, and underscore)
                         # A / character
                         # A . character
    )
    "
    

    Second one :

    "
    cd\            # Match the characters “cd ” literally
    (              # Match the regular expression below and capture its match into backreference number 1
       [\w\/\.]       # Match a single character present in the list below
                         # A word character (letters, digits, and underscore)
                         # A / character
                         # A . character
          +              # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )
    "
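
The difference between the two patterns can be checked with a short script. This is only a minimal sketch: first_capture is a hypothetical helper introduced here for illustration, and the third case, using *, is not from the original question. It is included to show when the group really can capture nothing.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns what
# group 1 captured, or undef if the pattern did not match at all.
sub first_capture {
    my ($pattern, $text) = @_;
    return $text =~ $pattern ? $1 : undef;
}

my $tmp = "cd abc/test/.";

# No quantifier: the class must match exactly one character.
print first_capture(qr/cd ([\w\/\.])/, $tmp), "\n";          # prints: a

# "+": one or more characters from the class, as many as possible.
print first_capture(qr/cd ([\w\/\.]+)/, $tmp), "\n";         # prints: abc/test/.

# "*": zero or more, so here the group captures the empty string.
print "[", first_capture(qr/cd ([\w\/\.]*)/, "cd "), "]\n";  # prints: []
```

Only a quantifier that allows zero repetitions, such as * or ?, lets the group capture an empty string. Without a quantifier, the class either consumes one character or the whole match fails.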