Search code examples
perlparse-recdescent

Parse::RecDescent - getting information from it


I'm working with the Parse::RecDescent parser in Perl, and I seem to have the most terrible time getting information from it. The information readily available online does not seem to have non-trivial examples.

Here is the code:

event_function: object_list ':' event_list ';'
        <defer:
        {       #item is a special character with Parse::Recdescent.
            print Dumper($item{object_list});
            $return = $item[1];
        }
        >
        | object_list ':' ';'
        <defer:
        { 
            print Dumper($item{object_list});
            $return = $item[1];
        }
        >

Here is the output

PS W:\developers\paulnathan\rd_dir> perl parser.pl testfile
$VAR1 = 4;
$VAR1 = 8;
PS W:\developers\paulnathan\rd_dir>

The input file parses correctly.

stuff, stuff2: pre-operation event = {foo1, foo2};

It should be outputting a hash keyed by "stuff", "stuff2".

Thoughts?

edit:

object_list : 
        object ',' object_list
        <defer:
        {

            my $retval = ();
            $retval = ::merge_hash_refs($item[1], $item[3]);

            $return = $retval;
        }
        >
        | object
        <defer:
        { 
            #print Dumper($item{object});
            $return = $item{object};
        }
        >       

    object : 
        '/' /[a-z0-9_][a-z0-9_]*/ '/' '...'
            <defer:
            {
                $::objects->{$item[2]} = "stuff";
                $return = $::objects;
            }
            >
        |  /[a-z0-9_][a-z0-9_]*/
            <defer:
            { 
                $::objects->{$item[1]} = "stuff";
                $return = $::objects;
            }
            >

edit2: Merge_hash_refs, just in case. :-)

#takes two hash references.
sub merge_hash_refs {
    my($ref1, $ref2) = @_;
    my $retref = ();
    while( my ($k, $v) = each %$ref1 ) {
        $retref->{$k} = $v;
    }
    while( my ($k, $v) = each %$ref2 ) {
        $retref->{$k} = $v;
    }

    return $retref;
}

Solution

  • If you add a use strict to your script you'll get the fatal error Can't use string ("1") as a HASH ref while "strict refs" in use at [the call to merge_hash_refs]. It appears that the closures created by the <defer> directives are causing the contents of @item to be the ones when the production matched instead of the hashrefs eventually returned by the subrules. Removing the <defer> directives gives me this output:

    $VAR1 = {
              'stuff2' => 'stuff',
              'stuff' => 'stuff'
            };
    

    Of course, this has the side effect that $::object is updated by successful object productions even if the higher level rules fail (including backtracking). I'd write it this way:

    use strict;
    use warnings;
    use Parse::RecDescent;
    use Data::Dumper;
    
    my $parser = Parse::RecDescent->new(<<'EOT');
    event_function: object_list ':' event_list(?) ';'
        {
            $return = $item[1];
        }
    
    object_list : <leftop: object ',' object>
        {
            $return = { map { %$_ } @{$item[1]} };
        }
    
    object : 
        '/' /[a-z0-9_][a-z0-9_]*/ '/' '...'
            {
                $return = { $item[2] => 'stuff' };
            }
        |  /[a-z0-9_][a-z0-9_]*/
            { 
                $return = { $item[1] => 'stuff' };
            }
    
    # stub, don't know what this should be
    event_list : /[^;]+/
    
    EOT
    
    my %object;
    
    while (<DATA>) {
        my $x = $parser->event_function($_);
    
        next unless $x;
    
        # merge objects into master list
        while (my ($k, $v) = each %$x) {
            $object{$k} = $v;
        }
    }
    
    print Dumper \%object;
    
    __DATA__
    stuff, stuff2: pre-operation event = {foo1, foo2};
    stuff3, stuff4: ;
    

    The output is:

    $VAR1 = {
              'stuff2' => 'stuff',
              'stuff3' => 'stuff',
              'stuff' => 'stuff',
              'stuff4' => 'stuff'
            };