regex perl quantifiers

Perl REGEX Question


As a PHP programmer new to Perl working through 'Programming Perl', I have come across the following regex:

/^(.*?): (.*)$/;

This regex is intended to parse an email header and store each field in a hash. The email header is contained in a separate .txt file, in the following format:

From: person@site.com
To: email@site.com
Date: Mon, 1st Jan 2000 09:00:00 -1000
Subject: Subject here

The entire code I am using to work with this example regex is as follows:

use warnings;
use strict;

my %fields = ();

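# Lexical filehandle and three-argument open; $! reports why the open failed.
open(my $fh, '<', 'header.txt') or die("Could not open header.txt: $!");

while(<$fh>)
{
    # Only store the captures when the line actually matched.
    if(/^(.*?): (.*)$/)
    {
        $fields{$1} = $2;
    }
}

close($fh);

# Print each header name together with its value.
foreach(sort keys %fields)
{
    print "$_: $fields{$_}\n";
}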

Now, on to my question. I am unsure why the first subpattern uses a minimal (non-greedy) quantifier, .*? rather than .*. It is perhaps a small point to get hung up on, but I cannot see why it has been done.

Thanks for any replies.


Solution

  • If the first subpattern were greedy (.* instead of .*?), the match could split in the wrong place whenever the value itself contains :<space>.

    Imagine:

    Subject: Urgent: Need a regex
    

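    Without the minimal match, $1 would get Subject: Urgent and $2 would be Need a regex, because a greedy .* backs off only as far as the last :<space> in the line. With the minimal match, the first subpattern stops at the first :<space>, so $1 is Subject and $2 is Urgent: Need a regex.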
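
To see the difference directly, here is a minimal sketch that runs both forms of the pattern against the example line. The helper name split_header is hypothetical and exists only for this illustration; it is not from 'Programming Perl' or the code in the question.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the book): applies a two-capture pattern
# to a line and returns ($1, $2), or an empty list if there is no match.
sub split_header
{
    my ($line, $pattern) = @_;
    if ($line =~ $pattern)
    {
        return ($1, $2);
    }
    return;
}

my $line = 'Subject: Urgent: Need a regex';

# Minimal first group: stops at the first ": ".
my @minimal = split_header($line, qr/^(.*?): (.*)$/);

# Greedy first group: backtracks only as far as the last ": ".
my @greedy = split_header($line, qr/^(.*): (.*)$/);

print "minimal: [$minimal[0]] [$minimal[1]]\n";
print "greedy:  [$greedy[0]] [$greedy[1]]\n";
```

The minimal version should report Subject as the field name, while the greedy version reports Subject: Urgent. For lines whose value contains no :<space>, both patterns give the same result, which is why the difference is easy to miss with the sample header in the question.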