Accessing a regexp subtree in a parse tree


I have the following Rascal module:

module foo

import IO;
import ParseTree;
extend lang::std::Layout;

lexical CHAR = [ab];
start syntax CharList = CHAR hd (','  CHAR)+ tl ';';

My question is how to get to the individual elements of the tl part, after having parsed something. E.g.:

rascal>import foo;
ok

rascal>pt = parse(#start[CharList], "a, b;");
start[CharList]: `a, b;`
Tree: appl(...

rascal>pt.top.tl;
(',' CHAR)+: `, b`
Tree: appl(regular(...

Now, how do I access the ", b" element? pt.top.tl[0] does not seem to be the right way.

Thanks for any help.


Solution

  • Probably the easiest thing to do, unless you need to define lists in the way you've done above with a separate head and tail, is to use Rascal's built-in separated list construct, like so:

    start syntax CharList = {CHAR ","}+ chars ';';
    

    (If you need the separate head and tail, please see Jurgen's answer below. You can use this same notation there as well.)

    This defines a list of 1 or more (because of the +) comma-separated CHARs. If you need 0 or more you would instead use *. Rascal allows you to iterate over separated lists, so you can get the characters back like so:

    rascal>chars = [c | c <- pt.top.chars];
    

    For the list in your example, this gives me back the following:

    list[CHAR]: [appl(
        prod(
          lex("CHAR"),
          [\char-class([range(97,98)])],
          {}),
        [char(97)])[
        @loc=|unknown:///|(0,1,<1,0>,<1,1>)
      ],appl(
        prod(
          lex("CHAR"),
          [\char-class([range(97,98)])],
          {}),
        [char(98)])[
        @loc=|unknown:///|(3,1,<1,3>,<1,4>)
      ]]
    

    You could also turn these into strings if you wanted to view them more easily or do something with their string values:

    rascal>charsAsStrings = ["<c>" | c <- pt.top.chars ];
    list[str]: ["a","b"]
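
    Putting the pieces together, a minimal sketch of the revised module might look like the one below. It reuses the separated-list grammar and the comprehension shown above. The helper name charStrings is hypothetical, made up here for illustration, and is not part of any Rascal library.

    ```rascal
    module foo

    import ParseTree;
    extend lang::std::Layout;

    lexical CHAR = [ab];

    // One or more CHARs separated by commas, followed by a semicolon.
    start syntax CharList = {CHAR ","}+ chars ';';

    // Hypothetical helper (not a library function): parse the input and
    // return the text of each CHAR in the list, skipping the separators.
    list[str] charStrings(str input) {
      start[CharList] pt = parse(#start[CharList], input);
      return ["<c>" | c <- pt.top.chars];
    }
    ```

    With the module imported, calling charStrings("a, b;") should give the same ["a","b"] result as the comprehension above. Note that parse throws a ParseError on input that does not match the grammar, so a caller may want to wrap the call in a try/catch.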