
Scala macros: What is the difference between typed (aka typechecked) and untyped Trees


I'm getting started with Scala macros. They're awesome, but I'm running into the difference between typed (aka typechecked) and untyped Trees.

For example, you can't call c.eval with a typechecked Tree for some reason. I can't find documentation on this 'typechecked' in the scala macros documentation (I know they're still working on that, this might be something that needs to be added some day).

What does it mean for a Tree to be typechecked? Why are they so different that apparently c.eval can't deal with typechecked Trees? (The inverse would make more sense to me.)

I guess this is probably compiler 101, but I didn't take that course :( Any explanation or pointer to articles/documentation would be appreciated!


Solution

    Theoretical part

    This is an architectural peculiarity of scalac that started leaking into the public API once we exposed internal compiler data structures in compile-time / runtime reflection in 2.10.

    Very roughly speaking, scalac's frontend consists of a parser and a typer, both of which work with trees and produce trees as their result. However, the properties of these trees are quite different, because trees produced by the parser are unattributed (their symbol field is set to NoSymbol and their tpe field is set to null), whereas trees produced by the typer are attributed.
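
    The attributed/unattributed distinction can be observed from runtime reflection. The following is a minimal sketch, assuming Scala 2.11 or later with scala-reflect and scala-compiler on the classpath; the object name TreeAttributes and the sample expression are made up for illustration.

```scala
import scala.reflect.runtime.currentMirror
import scala.reflect.runtime.universe._
import scala.tools.reflect.ToolBox

object TreeAttributes {
  // A runtime toolbox exposes scalac's parser and typer.
  private val tb = currentMirror.mkToolBox()

  // Parser output: an unattributed tree.
  val untyped: Tree = tb.parse("List(1, 2, 3).length")

  // Typer output: an attributed tree. A copy is typechecked so that
  // the parser tree stays available for comparison.
  val typed: Tree = tb.typecheck(untyped.duplicate)

  def main(args: Array[String]): Unit = {
    println(s"untyped: tpe = ${untyped.tpe}, symbol = ${untyped.symbol}")
    println(s"typed:   tpe = ${typed.tpe}, symbol = ${typed.symbol}")
  }
}
```

    Running main prints the tpe and symbol of both trees side by side; the untyped tree reports a null tpe and NoSymbol.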

    Now you might wonder what difference this can make, because it's just symbol and tpe, right? However, in scalac it's more than just that. In order to do its job, typer changes the structure of the ASTs it's processing, destroying some original trees and producing some synthetic trees. Unfortunately, sometimes these transformations are irreversible, which means that if one typechecks a tree, and then erases all assigned attributes, the resulting tree isn't going to make sense anymore (https://issues.scala-lang.org/browse/SI-5464).
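
    To see the change in shape, one can print the raw structure of a tree before and after typechecking. This is only a sketch under the same assumptions (Scala 2.11 or later, scala-compiler on the classpath); TreeShapes is a hypothetical name, and the exact output of showRaw varies between compiler versions.

```scala
import scala.reflect.runtime.currentMirror
import scala.reflect.runtime.universe._
import scala.tools.reflect.ToolBox

object TreeShapes {
  private val tb = currentMirror.mkToolBox()

  // What the parser sees: a plain application of the identifier List.
  val untyped: Tree = tb.parse("List(1, 2)")
  val untypedShape: String = showRaw(untyped)

  // What the typer produces: the call is rewritten to go through
  // List.apply with an inferred type argument, so the tree has new nodes.
  val typedShape: String = showRaw(tb.typecheck(untyped.duplicate))

  def main(args: Array[String]): Unit = {
    println(s"parser tree: $untypedShape")
    println(s"typer tree:  $typedShape")
  }
}
```

    The typer tree contains a TypeApply node and a selection of apply that the source never mentioned, which is the kind of synthetic structure that does not always survive erasing the attributes.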

    All right, but why would one want to erase (or in scalac parlance, reset, as in resetLocalAttrs or resetAllAttrs) attributes of typechecked trees? Well, this necessity stems from another implementation detail: symbols and their owner chains. Just a few days ago I wrote up some details about that on scala-internals: https://groups.google.com/d/msg/scala-internals/rIyJ4yHdPDU/qS-7DreEbCwJ, but in a nutshell you can't typecheck a tree in some lexical context and then simply use it in a different lexical context (which is essentially what c.eval needs).

    So, to sum up the state of the art in scalac tree management:

    1. Untyped trees (also known as parser trees or unattributed trees) are observationally different from typed trees (also known as typer trees, typechecked trees or attributed trees)
    2. There are two main differences between these two tree flavors: a) typed trees have symbols and types set by the typechecker, b) typed trees have somewhat different shapes.
    3. Usually, if some compiler API takes a tree, then both untyped and typed trees will do. However in some cases (one of which I outlined above), only untyped or only typed trees are appropriate.
    4. One can go from an untyped tree to a typed tree by calling Context.typecheck (compile-time reflection) or ToolBox.typecheck (runtime reflection), but going back from a typed tree to an untyped tree via resetLocalAttrs or resetAllAttrs is currently unreliable because of https://issues.scala-lang.org/browse/SI-5464.
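
    The round trip from point 4 can be sketched with a runtime toolbox. The assumptions here are Scala 2.11 or later (where, as far as I know, the public name for resetLocalAttrs is untypecheck) and scala-compiler on the classpath; TreeRoundTrip is a hypothetical name. The expression is deliberately simple, because the reset step is not guaranteed to work for arbitrary trees.

```scala
import scala.reflect.runtime.currentMirror
import scala.reflect.runtime.universe._
import scala.tools.reflect.ToolBox

object TreeRoundTrip {
  private val tb = currentMirror.mkToolBox()

  // Untyped tree straight from the parser.
  val untyped: Tree = tb.parse("val x = 20; x + 22")

  // Untyped -> typed.
  val typed: Tree = tb.typecheck(untyped.duplicate)

  // Typed -> untyped again. This erases local symbols and types;
  // it works for this small tree but is unreliable in general (SI-5464).
  val reset: Tree = tb.untypecheck(typed.duplicate)

  // The reset tree can be compiled and evaluated in a fresh context.
  val result: Any = tb.eval(reset)

  def main(args: Array[String]): Unit =
    println(s"typed as ${typed.tpe}, evaluates to $result")
}
```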

    So, as you can see, our trees are quite capricious, which brings a great deal of complexity into metaprogramming in Scala.

    However, the good news is that this complexity isn't dictated by fundamental reasons that originate in compiler 101. All the complexity is incidental, and we plan to evict it step by step until it's all gone. https://groups.google.com/forum/#!topic/scala-internals/TtCTPlj_qcQ (also posted a couple of days ago) is the first step in this direction. Stay tuned for other goodies which might also arrive this year!

    Practical part

    After thoroughly scaring you by elaborating all the details and alluding to mysterious cases when nothing works, I would like to note that usually one doesn't need to know about this kind of stuff when using macros. Very often both untyped trees (manually constructed ASTs, quasiquotes) and typed trees (macro arguments) work just fine.

    In cases when scalac wants a particular tree flavor, it will either tell you, as c.eval does, or sometimes crash in your face (RefChecks, LambdaLift and GenICode crashes are strong indicators that trees got mixed up during macro expansion; in those cases use resetLocalAttrs as described in https://groups.google.com/forum/#!msg/scala-internals/rIyJ4yHdPDU/qS-7DreEbCwJ). Fixing this is my top priority, and I'm working on it right now. The fixes might make it into 2.11.0, in which case this answer will become obsolete very soon :)
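
    For the c.eval case from the question, the usual workaround is to reset a copy of the typed macro argument before evaluating it. Below is a hedged sketch for Scala 2.11 or later (blackbox macros, c.untypecheck in place of resetLocalAttrs); EvalMacro, constEval and constEvalImpl are hypothetical names. Like any macro, it has to be compiled before the code that calls it.

```scala
import scala.language.experimental.macros
import scala.reflect.macros.blackbox

object EvalMacro {
  // Hypothetical macro: evaluates its Int argument at compile time
  // and splices the result back in as a literal.
  def constEval(x: Int): Int = macro constEvalImpl

  def constEvalImpl(c: blackbox.Context)(x: c.Expr[Int]): c.Expr[Int] = {
    import c.universe._
    // Macro arguments arrive as typed trees, which c.eval rejects,
    // so reset a duplicate first and leave the original untouched.
    val untyped = c.untypecheck(x.tree.duplicate)
    val value: Int = c.eval(c.Expr[Int](untyped))
    c.Expr[Int](Literal(Constant(value)))
  }
}
```

    This only works when the argument does not refer to anything from the call site's local scope, since the reset tree is evaluated in a different lexical context.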