Replacing types in AST rascal


I am trying to replace all types in an AST. I am analyzing Java code using the M3 model; the AST definitions are from here.

If we take this code:

Int a = 1;

I am able to update the type of 1 to void for example. But I am not able to change the type of the variable itself.

I've included some example lines. Is someone able to point out the errors in the lines?

case \method(Type \return, str name, list[Declaration] parameters, list[Expression] exceptions)
    => \method(\int(), "funcHolder", parameters, exceptions)

case \type(Type \type) => \void()
case \type => \void

Solution

  • Ok, excellent question. First your code and the errors it might have:

    This looks good:

    case \method(Type \return, str name, list[Declaration] parameters, list[Expression] exceptions)
        => \method(\int(), "funcHolder", parameters, exceptions) 
    

    The definition is: data Declaration = \method(Type \return, str name, list[Declaration] parameters, list[Expression] exceptions); (see here), and your code follows the definition exactly. Every abstract method declaration in the ASTs you've parsed will match this pattern. Methods with bodies will not, since there is another \method declaration for them with an additional argument, Statement impl.

    It may be that your example code contains no abstract methods (methods without a body); in that case this rule does nothing.

    A simpler version would also work fine:

    case \method(_, _, parameters, exceptions) => \method(\int(), "funcHolder", parameters, exceptions)
    

    The next one has issues:

    case \type(Type \type) => \void() 
    

    Because data Expression = \type(Type \type) is an Expression, whereas \void() is either a Type (data Type = \void()) or a TypeSymbol (data TypeSymbol = \void()), the rewrite is not type-preserving. It would do wrong things if the Rascal compiler did not detect this. Most likely it does nothing for you, because your example code does not contain this particular kind of expression. I suspect it is the abstract notation for things such as int.class and Integer.class.

    Then this one is "interesting":

    case \type => \void() 
    

    In principle, if \type is not bound in the current scope, then this matches literally anything. But there is probably a function or a variable called \type somewhere, and then this pattern tests for equality with that other thing in scope. Very nasty! I would guess it will not match anything. BTW, we are planning a "Rascal Amendment Proposal" for a language change to avoid such accidental bindings of things in the scope of a pattern.
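
    The two roles of a bare name in a pattern can be seen in isolation with a small sketch. The module and function names below are hypothetical and only illustrate the binding rule with plain integers; they are not part of the M3 library:

    ```rascal
    module BindingSketch

    // `expected` already has a value when the match runs, so the pattern
    // `expected` means "equal to the value of expected".
    bool matchesBound(int v) {
        int expected = 3;
        return expected := v;
    }

    // `fresh` is not defined anywhere in scope, so the pattern binds it
    // to the subject and the match succeeds for every v.
    bool matchesFresh(int v) {
        return fresh := v;
    }
    ```

    Under these assumptions, matchesBound succeeds only for 3, while matchesFresh succeeds for any integer. The same name in a pattern therefore behaves very differently depending on what happens to be in scope.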

    Later, from the comments, I learned that the goal was to replace all instances of a Type in the AST by \void(), to help in clone detection modulo type names. This is done as follows:

    case Type _ => \void() 
    

    We use a typed variable pattern, with the variable name _, to match any node of the algebraic type Type and discard the binding. That node is then replaced by \void().
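
    As a minimal, self-contained sketch of this typed-variable rewrite, the module below uses a hypothetical miniature AST (the data definitions and the function name eraseTypes are made up for illustration and are not the real M3 definitions):

    ```rascal
    module TypeErasureSketch

    // Hypothetical miniature AST for illustration only; these are NOT the
    // definitions from the Java M3 AST library.
    data Type
        = \int()
        | \void()
        | simpleType(str name)
        ;

    data Expression = number(str val);

    data Declaration = variable(Type t, str name, Expression init);

    // Replace every node of type Type, at any depth, by \void().
    Declaration eraseTypes(Declaration d) {
        return visit (d) {
            case Type _ => \void()
        };
    }
    ```

    With these assumed definitions, eraseTypes(variable(simpleType("Int"), "a", number("1"))) should yield variable(\void(), "a", number("1")): only the Type node changes, while the name and the initializer are left alone.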

    My way of working in the absence of a content-assist tool for pattern matching is as follows:

    1. find one full AST example of something you want to match, say Int a = 1;
    2. copy it into the Rascal source file
    3. remove the parts you want to abstract from by introducing variables or _
    4. test on the original example
    5. test on the rest of the system by printing out the loc that have matched and clicking on the locs to bring you to the code and see if it wasn't a false positive.

    For example, to rewrite Int to void, I find an example of Int in an AST and paste it:

    visit (ast) {
       case simpleType(simpleName("Int")) => Type::\void()  // I added Type:: to be sure to disambiguate with TypeSymbol::void()
    }
    

    With some debugging code attached to print out all matches:

    visit (ast) {
       case e:simpleType(simpleName("Int")) => Type::\void()  
         when bprintln("found a type at <e.src?|unknown:///|>");
    }
    

    Maybe you find that this has far too many matches and you have to be more specific. Let's change only declarations of the form Int _ = ....;. We first take an example:

    \variables(simpleType(simpleName("Int")), [...lots of stuff here...])
    

    and then we simplify it:

    \variables(simpleType(simpleName("Int")), names)
    

    and then include it in the visit:

    visit (ast) {
       case \variables(simpleType(simpleName("Int")), names) => \variables(Type::\void(), names) 
    }
    

    A final remark: as you can see, you can nest patterns as deeply as you want, so any combination is fine. A "complex" example:

    \variables(/"Int", names)
    

    This pattern finds any variable declaration where the name "Int" is used somewhere as part of the type declaration. That is looser than our original and might catch more than you bargained for. It just shows what kinds of combinations are possible.

    A final example of this: find all variable declarations with a type name which starts with "Int" but could end with anything else, like "Integer" or "IntFractal", etc:

    \variables(simpleType(simpleName(/^Int/)), names)
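
    The difference between the deep match and the regular expression match can be tried out on a small scale. The sketch below assumes a hypothetical miniature AST (a simplified variables constructor, an arrayType constructor, and the helper names mentionsInt and startsWithInt are all made up for illustration, not taken from the M3 definitions):

    ```rascal
    module PatternSketch

    // Hypothetical miniature AST, for illustration only.
    data Type
        = simpleType(str name)
        | arrayType(Type elem)
        ;

    data Declaration = variables(Type t, list[str] names);

    // True if the exact string "Int" occurs anywhere inside the declared type.
    bool mentionsInt(Declaration d) = variables(/"Int", _) := d;

    // True if the type is a simple type whose name starts with "Int".
    bool startsWithInt(Declaration d) = variables(simpleType(/^Int/), _) := d;
    ```

    Under these assumptions, mentionsInt accepts variables(arrayType(simpleType("Int")), ["a"]) because the deep match looks through the nesting, but rejects variables(simpleType("Integer"), ["a"]) because the string must be exactly "Int". startsWithInt behaves the other way around on these two values.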