How can I fix my regex to not match too much with a greedy quantifier?


I have the following line:

"14:48 say;0ed673079715c343281355c2a1fde843;2;laka;hello ;)"

I parse this by using a simple regexp:

if($line =~ /(\d+:\d+)\ssay;(.*);(.*);(.*);(.*)/) {
    my($ts, $hash, $pid, $handle, $quote) = ($1, $2, $3, $4, $5);
}

But the ; at the end messes things up and I don't know why. Shouldn't the greedy operator handle "everything"?


Solution

  • A greedy quantifier grabs as much as it can while still letting the overall pattern match. Here the first (.*) (right after "say;") grabs "0ed673079715c343281355c2a1fde843;2", the second takes "laka", the third takes "hello ", and the fourth is left with only the closing parenthesis ")".

    What you need to do is make all but the last one non-greedy, so they grab as little as possible and still match the string:

    (\d+:\d+)\ssay;(.*?);(.*?);(.*?);(.*)
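
To see the difference side by side, here is a small sketch that runs both patterns against the line from the question. The variable names (@greedy, @lazy, @negated) are made up for illustration. The third pattern, which uses the negated character class [^;]* instead of .*?, is not part of the answer above; it is a common alternative that should capture the same fields for this input.

```perl
use strict;
use warnings;

my $line = '14:48 say;0ed673079715c343281355c2a1fde843;2;laka;hello ;)';

# Greedy: the first (.*) takes as much as it can, so it swallows a ';'
# and pushes every later field one position to the right.
my @greedy = $line =~ /(\d+:\d+)\ssay;(.*);(.*);(.*);(.*)/;

# Non-greedy: each (.*?) stops at the earliest ';' that still lets the
# whole pattern match; the final greedy (.*) keeps the rest, ';' included.
my @lazy = $line =~ /(\d+:\d+)\ssay;(.*?);(.*?);(.*?);(.*)/;

# Alternative (an assumption, not from the answer): [^;]* cannot cross
# a ';' at all, so no backtracking is needed to find the field ends.
my @negated = $line =~ /(\d+:\d+)\ssay;([^;]*);([^;]*);([^;]*);(.*)/;

print join('|', @greedy),  "\n";
print join('|', @lazy),    "\n";
print join('|', @negated), "\n";
```

With the non-greedy pattern the fields come out as the timestamp, the hash, "2", "laka" and "hello ;)", which is what the original code expected to find in $1 through $5.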