regex, perl

Should \Q...\E be used to escape metacharacters in the replacement of s///?


Are there any metacharacters that can appear in the replacement part of a substitution regexp, except for \1 and $1? When interpolating a variable, should I use \Q...\E?

Example:

my $str = 'foo bar baz';
$str =~ s/ba/\Q$benign_user_input\E/g;

Solution

  • I salute you for trying to avoid code injection bugs. Here's the question you must perpetually ask yourself to avoid them:

    Am I concatenating two different kinds of content together?

    If the answer is yes, you have a potential problem. If the result will be used as code, that's a potential code injection bug.

    Are you trying to concatenate arbitrary text with shell code? Potential code injection bug.

    Are you trying to concatenate arbitrary text with SQL code? Potential code injection bug.

    Are you trying to concatenate arbitrary text with HTML code? Potential code injection bug.

    Are you trying to concatenate arbitrary text with a regex pattern? Potential code injection bug.

    This last one is where \Q...\E is useful. It's used on the pattern side to convert text into a regex pattern that matches that text.
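
As a minimal sketch of that pattern-side use (the variable names and sample strings here are illustrative assumptions, not part of the question), compare a match with and without \Q...\E when the text contains a regex metacharacter:

```perl
use strict;
use warnings;

# Hypothetical user-supplied text that happens to contain a regex metacharacter.
my $user_text = 'a.c';

my @candidates = ('a.c', 'abc');

# Without \Q...\E the text is compiled as a pattern, so "." matches any character.
my @unquoted = grep { /\A$user_text\z/ } @candidates;

# With \Q...\E the text is converted into a pattern that matches it literally.
my @quoted = grep { /\A\Q$user_text\E\z/ } @candidates;

print "unquoted: @unquoted\n";   # unquoted: a.c abc
print "quoted:   @quoted\n";     # quoted:   a.c
```

\Q...\E applies the same escaping as the built-in quotemeta function, so quotemeta($user_text) could be used to build the pattern instead.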


    Now let's put that to the test here.

    You have

    my $str = 'foo bar baz';
    $str =~ s/ba/$benign_user_input/g;
    

    This is equivalent to

    my $str = "foo " . $benign_user_input . "r " . $benign_user_input . "z";
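
The equivalence can be checked with a short sketch. The sample input is an assumption chosen to contain characters that look special. The replacement side behaves like a double-quoted string, so the contents of the interpolated variable are inserted as-is and are not parsed again. Wrapping the variable in \Q...\E there would put literal backslashes into the result:

```perl
use strict;
use warnings;

# Hypothetical input: ".", "$1" and "\1" are ordinary text here because the
# variable's contents are not re-parsed after interpolation.
my $benign_user_input = 'a.b $1 \1';

# The substitution from the question, without \Q...\E.
(my $plain = 'foo bar baz') =~ s/ba/$benign_user_input/g;

# The explicit concatenation it is claimed to be equivalent to.
my $concat = "foo " . $benign_user_input . "r " . $benign_user_input . "z";

print "plain:  $plain\n";
print "same:   ", ($plain eq $concat ? "yes" : "no"), "\n";   # same:   yes

# \Q...\E in the replacement runs quotemeta on the text, so backslashes
# end up in the resulting string.
(my $quoted = 'foo bar baz') =~ s/ba/\Q$benign_user_input\E/g;
print "quoted: $quoted\n";
```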
    

    We have concatenation, so we must ask ourselves

    Am I concatenating two different kinds of content together?

    It's unclear what kind of content foo bar baz is (plain text? shell code? HTML?), and it's unclear what $benign_user_input contains. So there is insufficient information to answer the question.

    For example, this would be a code injection bug:

    my $benign_user_input = "Document 1.pdf";
    
    my $shell_cmd = 'rm -- file';
    
    $shell_cmd =~ s/file/$benign_user_input/;
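
To make the problem concrete, here is a sketch that only builds and prints the command strings (nothing is executed). The helper shell_quote_text is a hypothetical name introduced for illustration, and it assumes a POSIX-style shell. It shows one way of converting text into shell code before it is substituted in:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original answer): convert arbitrary
# text into a POSIX shell single-quoted literal.
sub shell_quote_text {
    my ($text) = @_;
    $text =~ s/'/'\\''/g;    # close the quote, emit an escaped ', reopen
    return "'$text'";
}

my $benign_user_input = "Document 1.pdf";

# Text inserted as-is: the shell would see two arguments, "Document" and "1.pdf".
(my $unsafe_cmd = 'rm -- file') =~ s/file/$benign_user_input/;

# Text converted to shell code first: the shell would see a single argument.
my $literal = shell_quote_text($benign_user_input);
(my $safe_cmd = 'rm -- file') =~ s/file/$literal/;

print "$unsafe_cmd\n";   # rm -- Document 1.pdf
print "$safe_cmd\n";     # rm -- 'Document 1.pdf'
```

Note that \Q...\E would not be the right tool here: it escapes for regex patterns, not for the shell. Without a conversion suited to the target language, the substitution above remains a code injection bug.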
    

    But maybe not this:

    my $benign_user_input = "World";
    
    my $greeting = 'Hello, name!';
    
    $greeting =~ s/name/$benign_user_input/;