Update: Corrected code added below
I have a Leanpub-flavored Markdown file named sample.md.
I'd like to convert its code blocks into GitHub-flavored Markdown style using a Raku regex.
Here's a sample **ruby** code, which
prints the elements of an array:
{:lang="ruby"}
    ['Ian','Rich','Jon'].each {|x| puts x}
Here's a sample **shell** code, which
removes the ending commas and
finds all folders in the current path:
{:lang="shell"}
    sed s/,$//g
    find . -type d
To capture the lang value (e.g. ruby) from {:lang="ruby"} and convert it into ```ruby, I use this code:
my @in="sample.md".IO.lines;
my @out;
for @in.kv -> $key,$val {
    if $val.starts-with("\{:lang") {
        if $val ~~ /^{:lang="([a-z]+)"}$/ { # capture lang
            @out[$key]="```$0"; # convert it into ```ruby
            $key++;
            while @in[$key].starts-with(" ") {
                @out[$key]=@in[$key].trim-leading;
                $key++;
            }
            @out[$key]="```";
        }
    }
    @out[$key]=$val;
}
The line containing the regex gives the error "Cannot modify an immutable Pair (lang => True)".
I've just started out using regexes. Instead of ([a-z]+) I've tried (\w), and it gave the error "Unrecognized backslash sequence: '\w'", among other things.
How do I correctly capture and modify the lang value using a regex?
Corrected code:
my @in = "sample.md".IO.lines;
my \len = @in.elems;
my @out;
my $k = 0;
while ($k < len) {
    if @in[$k] ~~ / ^ '{:lang="' (\w+) '"}' $ / {
        push @out, "```$0";
        $k++;
        # copy the indented code lines, stopping at the end of the file
        while $k < len && @in[$k].starts-with(" ") {
            push @out, @in[$k].trim-leading;
            $k++;
        }
        push @out, "```";
    }
    push @out, @in[$k] if $k < len;
    $k++;
}
for @out { print "$_\n" }
TL;DR
TL? Then read @jjemerelo's excellent answer, which not only provides a one-line solution but much more in a compact form.
DR? Aw, imo you're missing some good stuff in this answer that JJ (reasonably!) ignores. Though, again, JJ's is the bomb. Go read it first. :)
There are many dialects of regex. The regex pattern you've used is a Perl regex, but you haven't told Raku that, so it's interpreting your regex as a Raku regex, not a Perl regex. It's like feeding Python code to perl. So the error message is useless.
One option is to switch to Perl regex handling. To do that, this code:
/^{:lang="([a-z]+)"}$/
needs m :P5 at the start:
m :P5 /^{:lang="([a-z]+)"}$/
The m is implicit when you use /.../ in a context where it is presumed you mean to immediately match, but because the :P5 "adverb" is being added to modify how Raku interprets the pattern in the regex, one has to also add the m.
:P5 only supports a limited set of Perl's regex patterns. That said, it should be enough for the regex you've written in your question.
If you want to use a Raku regex you have to learn the Raku regex language.
The "spirit" of the Raku regex language is the same as Perl's, and some of the absolute basic syntax is the same as Perl's, but it's different enough that you should view it as yet another dialect of regex, just one that's generally "powered up" relative to Perl's regexes.
To rewrite the regex in Raku format I think it would be:
/ ^ '{:lang="' (<[a..z]>+) '"}' $ /
(Taking advantage of the fact that whitespace in Raku regexes is ignored.)
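As a quick check of that Raku-format regex, here is a minimal sketch that wraps it in a small helper. The helper name fence-for is hypothetical (it is not in the question), and the fence string is built with x 3 only to avoid writing three literal backticks in the example.

```raku
# Hypothetical helper: returns the GitHub-style opening fence for a
# Leanpub lang line, or Nil when the line is not a lang line.
sub fence-for(Str $line) {
    my $fence = '`' x 3;                       # three backticks
    if $line ~~ / ^ '{:lang="' (<[a..z]>+) '"}' $ / {
        return $fence ~ $0;                    # $0 is the captured lang value
    }
    return Nil;
}

say fence-for('{:lang="ruby"}');               # fence followed by "ruby"
say fence-for('plain text').defined;           # False
```

The quoted pieces match literally, so the braces and the colon no longer get interpreted as Raku code inside the regex.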
After fixing the regex, one encounters other problems in your code.
The first problem I encountered is that $key is read-only, so $key++ fails. One option is to make it writable, by writing -> $key is copy ..., which makes $key a read-write copy of the index passed by the .kv.
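A small sketch of what is copy does and does not do, using made-up sample data (@lines and @seen are illustrative names, not from the question). Note that the copy is private to each iteration, so incrementing it does not make the for loop skip ahead.

```raku
my @lines = <first second third>;   # illustrative data
my @seen;

for @lines.kv -> $key is copy, $val {
    $key++;                         # allowed: $key is a writable copy
    @seen.push: "$key $val";        # the loop still visits every element
}

.say for @seen;
# 1 first
# 2 second
# 3 third
```

So is copy removes the error, but it cannot be used to consume several input lines in one pass of the loop, which is what the original code seems to rely on.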
But fixing that leads to another problem. And the code is so complex I've concluded I'd best not chase things further. I've addressed your immediate obstacle and hope that helps.
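For what it's worth, one way to sidestep the index juggling altogether is to walk the lines once and track whether you are inside a code block. This is only a sketch under assumptions, not the asker's method: the sub name leanpub-to-gfm is hypothetical, code lines are assumed to start with a space, and a blank line is assumed to end a block.

```raku
# Hypothetical helper: takes the lines of a Leanpub-style file and
# returns them with code blocks rewritten as GitHub-style fences.
sub leanpub-to-gfm(@lines) {
    my $fence   = '`' x 3;           # three backticks
    my $in-code = False;             # are we inside a code block?
    my @out;

    for @lines -> $line {
        if $line ~~ / ^ '{:lang="' (<[a..z]>+) '"}' $ / {
            @out.push($fence) if $in-code;      # close a block left open
            @out.push($fence ~ $0);             # open a fence with the lang
            $in-code = True;
        }
        elsif $in-code && $line.starts-with(' ') {
            @out.push($line.trim-leading);      # de-indent the code line
        }
        else {
            if $in-code {                       # first non-code line ends the block
                @out.push($fence);
                $in-code = False;
            }
            @out.push($line);
        }
    }
    @out.push($fence) if $in-code;              # file ended inside a block
    return @out;
}

my @sample = '{:lang="shell"}', '    sed s/,$//g', '    find . -type d', 'Some text';
.say for leanpub-to-gfm(@sample);
```

To run it on the real file, the sample list could be replaced with "sample.md".IO.lines.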