Convert Markdown italics to HTML using a regex in Perl


To convert the Markdown italic text in $script to HTML, I've written this:

my $script = "*so what*";
my $res =~ s/\*(.)\*/$1/g;
print "<em>$1</em>\n";

The expected result is:

<em>so what</em>

but it gives:

<em></em>

How can I make it give the expected result?


Solution

  • Problems:

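    • You print the wrong variable: $1 is printed instead of the result of the substitution.
    • You switch variable names halfway through: the substitution is applied to the empty $res, not to $script.
    • . matches exactly one character, so \*(.)\* can't match *so what*.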
    • You always add one EM element, even if no stars are found.
    • You always add one EM element, even if multiple pairs of stars are found.
    • You add the EM element around the entire output, not just the portion in stars.

    Fix:

    $script =~ s{\*([^*]+)\*}{<em>$1</em>}g;
    print "$script\n";
    

    or

    my $res = $script =~ s{\*([^*]+)\*}{<em>$1</em>}gr;
    print "$res\n";
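
As a quick check, the sketch below runs the second variant of the fix on a few inputs. The helper name to_html and the sample strings are made up for illustration; they are not part of the question. It shows that text without stars gets no EM element and that each pair of stars gets its own.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution from the fix above.
# The /r modifier (Perl 5.14+) returns the modified copy and leaves
# the original string untouched.
sub to_html {
    my ($script) = @_;
    return $script =~ s{\*([^*]+)\*}{<em>$1</em>}gr;
}

# Sample inputs are illustrative only.
for my $script ('*so what*', 'no stars here', '*one* and *two*') {
    print to_html($script), "\n";
}
# Prints:
#   <em>so what</em>
#   no stars here
#   <em>one</em> and <em>two</em>
```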
    

    But that's not all. Even with all the aforementioned problems fixed, your parser still has numerous other bugs. For example, it misapplies italics for all of the following:

    • **Important**
      Correct: <strong>Important</strong>
      Your code: *<em>Important</em>*
    • 4 * 5 * 6 = 120
      Correct: 4 * 5 * 6 = 120
      Your code: 4 <em> 5 </em> 6 = 120
    • 4 * 6 = 20 is *wrong*
      Correct: 4 * 6 = 20 is <em>wrong</em>
      Your code: 4 <em> 6 = 20 is </em>wrong*
    • `foo *bar* baz`
      Correct: <code>foo *bar* baz</code>
      Your code: `foo <em>bar</em> baz`
    • \*I like stars\*
      Correct: *I like stars*
      Your code: \<em>I like stars\</em>
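
The cases above can be kept away from the italics rule with a somewhat more careful pattern. The following is only a minimal sketch, not a Markdown parser: the function name italics_to_html is hypothetical, and the rules it applies (skip code spans, backslash escapes and runs of stars; require non-whitespace just inside the stars) are simplifying assumptions rather than the full Markdown emphasis rules. It only decides where italics apply; it does not convert bold, code spans or escapes to their HTML form.

```perl
use strict;
use warnings;

# Hypothetical helper: converts *text* to <em>text</em> while leaving
# code spans, backslash escapes and runs of stars untouched.
sub italics_to_html {
    my ($text) = @_;
    $text =~ s{
        (                   # $1: things to copy through unchanged
            `[^`]*`         #   a code span
          | \\.             #   a backslash escape
          | \*\*+           #   two or more stars in a row, such as bold markers
        )
      |
        \* (?=\S)           # opening star, followed by non-whitespace
        ([^*`\\]+)          # $2: the italic text itself
        (?<=\S) \*          # closing star, preceded by non-whitespace
        (?!\*)              # and not part of a longer run of stars
    }{ defined $1 ? $1 : "<em>$2</em>" }gexs;
    return $text;
}

print italics_to_html('*so what*'), "\n";              # <em>so what</em>
print italics_to_html('**Important**'), "\n";          # **Important**
print italics_to_html('4 * 5 * 6 = 120'), "\n";        # 4 * 5 * 6 = 120
print italics_to_html('4 * 6 = 20 is *wrong*'), "\n";  # 4 * 6 = 20 is <em>wrong</em>
print italics_to_html('`foo *bar* baz`'), "\n";        # `foo *bar* baz`
print italics_to_html('\*I like stars\*'), "\n";       # \*I like stars\*
```

The alternation is what matters here: because code spans, escapes and star runs are matched first and copied through, the italics branch never sees the stars inside them. Anything beyond these few cases is better left to a dedicated Markdown parser.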