Parsing tags from a file with Regexp::Grammars

I'm trying to capture free tags from comments in a program using Perl and the Regexp::Grammars CPAN module.

use strict;
use v5.10;
use YAML;

my $s = q{
      junk code;
      // here be tags #:tag1:
      junk code 2;
      // another one #:tag2:
      junk ...;
};

my $rg = do {
    use Regexp::Grammars;
    qr{
        <nocontext: >
        ^ .* <Tagger> .* $
        <rule: Tagger>         <[MATCH=single_tag]> +
        <token: single_tag>    \#\:<tag>\:
        <token: tag>           <matchline> \w+
    }xms;
};

if( $s =~ $rg ) {
    say Dump( \%/ );    
} else {
    say 'no match';
}

But the YAML output shows I'm only capturing the last tag:

  - tag:
      matchline: 5

How can I match all tags from the input data instead?

Also, how can I get the matched tag string without turning on the noisy context strings (that is, without removing the <nocontext:> directive), so that the final result is more readable, i.e.:

  - tag: tag1
    matchline: 3
  - tag: tag2
    matchline: 5


  • Found it:

    my $rg = do {
        use Regexp::Grammars;
        qr{
            <nocontext: >
            <Tagger>               # assumed start pattern: invoke the Tagger rule
            <rule: Tagger>         <[MATCH=single_tag]>+  % (.*)
            <token: single_tag>    <matchline> \#\:<tag>\:
            <token: tag>           \w+
        }xms;
    };

    Which yields the following YAML:

      - matchline: 3
        tag: tag1
      - matchline: 5
        tag: tag2
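
A rough core-Perl sketch, without Regexp::Grammars, of why the first pattern reports only the last tag, plus a cross-check for the expected list. The helper names last_tag_only and all_tags are made up for this illustration. The plain regexes only approximate the grammar, and they assume it was compiled with /xms so that `.` also matches newlines.

```perl
use strict;
use warnings;

# Same sample input as in the question.
my $s = q{
      junk code;
      // here be tags #:tag1:
      junk code 2;
      // another one #:tag2:
      junk ...;
};

# Approximates `^ .* <Tagger> .* $`: under /s the leading greedy `.*` runs to
# the end of the input and backtracks only as far as the LAST tag.
sub last_tag_only {
    my ($text) = @_;
    return $text =~ m{ ^ .* \#:(\w+): .* $ }xms ? $1 : undef;
}

# Visits every tag with /g. The line number is 1 plus the number of newlines
# that precede the start of the match.
sub all_tags {
    my ($text) = @_;
    my @found;
    while ( $text =~ m{ \#:(\w+): }xg ) {
        my $tag    = $1;
        my $prefix = substr( $text, 0, $-[0] );    # text before this match
        my $line   = 1 + ( $prefix =~ tr/\n// );   # tr counts, does not modify
        push @found, { tag => $tag, matchline => $line };
    }
    return @found;
}

print "greedy match sees only: ", last_tag_only($s), "\n";
printf "- tag: %s\n  matchline: %d\n", $_->{tag}, $_->{matchline}
    for all_tags($s);
```

On the sample input the loop reports tag1 at line 3 and tag2 at line 5, the same pairs as the YAML above. Since the `(.*)` separator in the grammar is greedy as well, it may be worth comparing the grammar against this loop on input with three or more tags.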