I'm trying to scan and parse VBA (Visual Basic for Applications) code for a school assignment.
I'm using the lex and yacc modules from Python's PLY. Right now I'm just trying to get variable declarations and assignments to work as a proof of concept. My preliminary grammar understands a single variable declaration OR a single assignment. As soon as I put a newline (\n) character in the input and add another statement, the parser doesn't understand anything. For example, in the code in the gist, if you remove "a = 3" from the string at line 92, it works fine and inserts an Identifier in the identifier list.
I handle newline characters in the scanner, so I think there's something wrong with my grammar definition, but I can't figure out what.
The grammar is basically this:
    statement   : declaration
                | assignment
    declaration : DIM IDENTIFIER AS TYPE
    assignment  : IDENTIFIER ASSIGN BOOLEAN
                | IDENTIFIER ASSIGN DOUBLE
                | IDENTIFIER ASSIGN INT
Note that IDENTIFIER, ASSIGN, BOOLEAN, DOUBLE, INT, DIM, AS and TYPE are all tokens defined in the lex module.
I've put the code in a gist: https://gist.github.com/clsk/22c386695dd1ddb7ca75
@rici wrote:
That's a grammar for a single statement. Why do you expect it to work with multiple statements? Nowhere is there a production which indicates that multiple statements are legal input.
The OP wrote:
Indeed, that was the issue. Thanks for the pointer.
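For anyone who lands here with the same problem, here is a minimal sketch of what the fix might look like: a left-recursive statement_list production placed above statement, so the parser accepts one or more statements. This is not the code from the gist. The token regexes, the reserved-word table, the tuple results and the parse() helper are all assumptions made for illustration, and the sketch assumes the lexer silently discards newlines (if your scanner returns a NEWLINE token instead, the list rule would need to consume it as a separator).

```python
import ply.lex as lex
import ply.yacc as yacc

tokens = ("DIM", "AS", "TYPE", "IDENTIFIER", "ASSIGN", "BOOLEAN", "DOUBLE", "INT")

# Hypothetical keyword table; VBA keywords are case-insensitive.
reserved = {"dim": "DIM", "as": "AS", "integer": "TYPE", "double": "TYPE",
            "boolean": "TYPE", "true": "BOOLEAN", "false": "BOOLEAN"}

t_ASSIGN = r"="
t_ignore = " \t"

def t_DOUBLE(t):
    r"\d+\.\d+"
    t.value = float(t.value)
    return t

def t_INT(t):
    r"\d+"
    t.value = int(t.value)
    return t

def t_IDENTIFIER(t):
    r"[A-Za-z_][A-Za-z0-9_]*"
    t.type = reserved.get(t.value.lower(), "IDENTIFIER")
    return t

def t_newline(t):
    r"\n+"
    # Assumption: newlines only bump the line counter and produce no token.
    t.lexer.lineno += len(t.value)

def t_error(t):
    raise SyntaxError("illegal character %r" % t.value[0])

# The start symbol is the first rule: a program is a list of statements.
def p_program(p):
    "program : statement_list"
    p[0] = p[1]

# These two rules are the missing piece: one or more statements.
def p_statement_list_many(p):
    "statement_list : statement_list statement"
    p[0] = p[1] + [p[2]]

def p_statement_list_one(p):
    "statement_list : statement"
    p[0] = [p[1]]

def p_statement(p):
    """statement : declaration
                 | assignment"""
    p[0] = p[1]

def p_declaration(p):
    "declaration : DIM IDENTIFIER AS TYPE"
    p[0] = ("declare", p[2], p[4])

def p_assignment(p):
    """assignment : IDENTIFIER ASSIGN BOOLEAN
                  | IDENTIFIER ASSIGN DOUBLE
                  | IDENTIFIER ASSIGN INT"""
    p[0] = ("assign", p[1], p[3])

def p_error(p):
    where = "end of input" if p is None else repr(p.value)
    raise SyntaxError("syntax error at " + where)

lexer = lex.lex()
parser = yacc.yacc(write_tables=False, debug=False)

def parse(source):
    """Parse VBA-like source text into a list of statement tuples."""
    return parser.parse(source, lexer=lexer)

if __name__ == "__main__":
    print(parse("Dim a As Integer\na = 3"))
```

The list rule is written left-recursively (statement_list on the left of its own right-hand side), which is the form generally preferred for LALR parsers such as the one PLY generates because it keeps the parse stack shallow.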