I'm trying to decode a URI-encoded bit of form data in Perl (the encoded data is %25admin, which should decode to %admin). I'm using a simple set of regexes that I've reused for a long time:
$value =~ tr/+/ /;
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
$value =~ s///g;
This set of regexes has served me well for years and usually does just fine, but in this case it outputs min ("%ad" is missing from the decoded string, as though it were part of the escaped character). What am I missing that causes it to interpret the characters %25ad as a single escaped character, rather than %25 as the escaped character and ad as independent of it?
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
This successfully converts %25admin to %admin, which is actually the result you want. But for some unknown reason you then do another substitution with an empty pattern:
$value =~ s///g;
This empty pattern has a special meaning. From perldoc perlop:
The empty pattern //
If the PATTERN evaluates to the empty string, the last successfully matched regular expression is used instead.
The last successfully matched regular expression is in the line above, so this statement essentially means:
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])//g;
This matches the %ad at the start of %admin, deletes it, and leaves min. Dropping the s///g line leaves the correct result, %admin.
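To make the difference concrete, here is a minimal sketch that runs both variants side by side. The subroutine names decode_form_value and decode_with_empty_pattern are hypothetical, chosen only for this illustration; the substitutions themselves are the ones from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: decodes a form value without the empty-pattern step.
sub decode_form_value {
    my ($value) = @_;
    $value =~ tr/+/ /;                                    # '+' encodes a space
    $value =~ s/%([a-fA-F0-9]{2})/pack("C", hex($1))/eg;  # %XX -> single byte
    return $value;
}

# Hypothetical helper: same steps plus the s///g line from the question.
sub decode_with_empty_pattern {
    my ($value) = @_;
    $value =~ tr/+/ /;
    $value =~ s/%([a-fA-F0-9]{2})/pack("C", hex($1))/eg;
    # Empty pattern: Perl reuses the last successfully matched regex,
    # so this acts like s/%([a-fA-F0-9]{2})//g and strips "%ad".
    $value =~ s///g;
    return $value;
}

print decode_form_value('%25admin'), "\n";          # prints: %admin
print decode_with_empty_pattern('%25admin'), "\n";  # prints: min
```

If the URI::Escape module is available, its uri_unescape function can take over the %XX step, though it does not translate '+' to a space, so the tr line would still be needed for form data.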