
Scala compiler output after cleanup phase


I would like to develop a tool that post-processes a Scala program once the Scala compiler has done all the heavy lifting. From what I understand, the compiler's phases incrementally simplify the program, removing syntactic sugar and advanced features such as lambdas, closures, and pattern matching. However, I notice that the output of the so-called cleanup phase, which is the last phase before code generation, looks like Scala but is not really Scala.

Can anyone explain, or point me to a resource that explains, the language that comes out of the cleanup phase?

To give you an example, in the output of the cleanup phase I see things like:

case <synthetic> val x1: Foo$Bar = l;
  case9(){
    if (...some condition...)
      matchEnd8(scala.Predef.Set().empty())
    else
      case10()
  };

My hypothesis is that this is the result of translating pattern matching, but as far as I understand it is not valid Scala syntax (I am not an experienced Scala developer at all!).

I guess it all comes down to this: is it possible, in general, to convert the output of the cleanup phase back into valid, compilable Scala code?


Solution

  • In general, at any stage of scalac (even right after parsing), the compiler's internal representation is no longer valid Scala code. That is essentially because of labels and gotos, which you have discovered.

    A structure of the form

    labelName(...params){
      ...
    }
    

    is a label definition, and a call of the form

    labelName(...args)
    

    is a jump to that label, assigning the ...args to the ...params.

    scalac uses labels and gotos (as does dotc, with a different representation) to represent while and do..while loops (immediately after parsing), the translation of pattern matches, and tail-recursion-optimized functions.
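
    To make the loop case concrete, here is a minimal sketch that puts an ordinary while loop next to a hand-written analogue of its label form. The object and method names (LoopAsLabel, sumWhile, sumLabelStyle, while1) are made up for illustration, and the tree shown in the comment is only an approximation of what the compiler prints; the local method stands in for the label, and the call to it in tail position stands in for the jump.

    ```scala
    import scala.annotation.tailrec

    object LoopAsLabel {
      // Ordinary source-level loop: sums 1..n.
      def sumWhile(n: Int): Int = {
        var i = 1
        var acc = 0
        while (i <= n) {
          acc += i
          i += 1
        }
        acc
      }

      // Hand-written analogue of the label form. Roughly (an approximation,
      // not verbatim compiler output), the loop above is represented as
      //   while$1(){ if (i <= n) { acc += i; i += 1; while$1() } else () }
      // where the inner `while$1()` is a jump back to the label.
      def sumLabelStyle(n: Int): Int = {
        var i = 1
        var acc = 0
        @tailrec def while1(): Unit =
          if (i <= n) {
            acc += i
            i += 1
            while1() // plays the role of the jump to the label
          } else ()
        while1()
        acc
      }
    }
    ```

    Both methods compute the same sum; the second one is valid Scala only because the "jump" happens to sit in tail position, so a tail call behaves the same way.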

    In general, there is no way to go back from the internal representation to valid Scala code, especially this late in the pipeline, after cleanup.
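
    For reading the snippet in the question, it may help to see a small match written by hand in the same case/matchEnd shape. This is only an analogy under stated assumptions: the types (Shape, Circle, Square) and the names x1, case1, case2, case3 and matchEnd4 are hypothetical, modeled on the shape of the output quoted in the question, and local methods are used in place of labels. It compiles only because every "jump" here is in tail position, so a method call behaves like a jump; it is not a general recipe for turning compiler trees back into source.

    ```scala
    object MatchAsLabels {
      sealed trait Shape
      final case class Circle(r: Double) extends Shape
      final case class Square(side: Double) extends Shape

      // Source-level pattern match.
      def describe(s: Shape): String = s match {
        case Circle(r) if r > 1.0 => "big circle"
        case Circle(_)            => "small circle"
        case Square(_)            => "square"
      }

      // Hand-written analogue of the label-based form: each caseN tests one
      // alternative and either "jumps" to matchEnd4 with the result or falls
      // through to the next case.
      def describeLabelStyle(s: Shape): String = {
        val x1: Shape = s // scrutinee bound once, like the synthetic val
        def matchEnd4(x: String): String = x
        def case3(): String =
          if (x1.isInstanceOf[Square]) matchEnd4("square")
          else throw new MatchError(x1)
        def case2(): String =
          if (x1.isInstanceOf[Circle]) matchEnd4("small circle")
          else case3()
        def case1(): String =
          if (x1.isInstanceOf[Circle] && x1.asInstanceOf[Circle].r > 1.0)
            matchEnd4("big circle")
          else case2()
        case1()
      }
    }
    ```

    The two methods agree on every Shape, which is what makes the label form readable as "the match, with each case turned into a jump target".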