Search code examples
scalakotlinclojurejvm-languages

How are nested functions and lexical scope compiled in JVM languages?


As a concrete example for my question, here's a snippet in Python (which should be readable to the broadest number of people and which has a JVM implementation anyway):

def memo(f):
  cache = {}
  def g(*args):
    if args not in cache:
      cache[args] = f(*args)
    return cache[args]
  return g

How do industrial-strength languages compile a definition like this, in order to realize static scope? What if we only have nested definition but no higher order function-value parameters or return values, à la Pascal (and hence no need for closures)? I'm guessing that calculating static links is out, since you can't access the stack frame of a method call. So what is commonly done? Do they build anonymous inner classes? Lambda lifting? Something else?

Sorry if this is a question that's been asked before; it seems like it must be, but I haven't found anything that's quite right.


Solution

  • I will be answering your question from the standpoint of Clojure, the only JVM language whose translation strategy I know intimately. For concreteness, I have translated your Python to the following Clojure (not idiomatic or thread-safe, but this is not important here):

    (defn memo [f]
      (let [cache (atom {})]
        (fn g [& args]
          (when-not (contains? (@cache args))
            (swap! cache assoc args (apply f args)))
          (get @cache args))))
    

    Inner classes (mentioned in the question and the comments) are a convenience for programmers, and the compiler doesn't need them1. Each Clojure function definition (not function invocation!) corresponds to a single top-level class implementing clojure.lang.IFn (usually through some abstract helper class). In that class, each closed-over lexical variable is saved as a field; these are initialized in the constructor. So this function definition expands to something like:

    class memo extends AFunction {
      // static constants...
      public Object invoke(Object f) {
        Object cache = ...;
        return new memo$g__1723(cache);
      }
    }
    
    class memo$g__1723 extends RestFn {
      static Object swap_BANG_ = RT.var("clojure.core", "swap!");
      static Object assoc = RT.var("clojure.core", "assoc");
      static Object apply = RT.var("clojure.core", "apply");
      // ... more static constants for each function used ...
    
      Object f;
      Object cache;
    
      public memo$g__1723(Object f, Object cache) {
        this.f = f;
        this.cache = cache;
      }
    
      public int getRequiredArity() { return 0;}
    
      public Object applyTo(ISeq args) {
        Object cache = this.cache;
        if (/*...*/) {
          ((IFn)swap_BANG_).invoke(cache, assoc, args,
              ((IFn)apply).invoke(this.f, args));
        }
        return /*...*/;
      }
    }
    

    1In fact, in the version of Java that Clojure targets, inner classes don't exist at the JVM level - they are a fiction that the java compiler translates into separate top-level classes with secret access mechanisms, much as Clojure translates nested functions to top-level classes. In more recent versions of Java, the VM itself does actually understand nested classes.

    For completeness, the full disassembled bytecode for memo and its inner function follows below.

    $ javap -c -p 'tmp$memo' 'tmp$memo$g__1723'
    Compiled from "tmp.clj"
    public final class tmp$memo extends clojure.lang.AFunction {
      public static final clojure.lang.Var const__0;
    
      public tmp$memo();
        Code:
           0: aload_0
           1: invokespecial #9                  // Method clojure/lang/AFunction."<init>":()V
           4: return
    
      public static java.lang.Object invokeStatic(java.lang.Object);
        Code:
           0: getstatic     #15                 // Field const__0:Lclojure/lang/Var;
           3: invokevirtual #21                 // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
           6: checkcast     #23                 // class clojure/lang/IFn
           9: getstatic     #29                 // Field clojure/lang/PersistentArrayMap.EMPTY:Lclojure/lang/PersistentArrayMap;
          12: invokeinterface #32,  2           // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;)Ljava/lang/Object;
          17: astore_1
          18: new           #34                 // class tmp$memo$g__1723
          21: dup
          22: aload_1
          23: aconst_null
          24: astore_1
          25: aload_0
          26: aconst_null
          27: astore_0
          28: invokespecial #37                 // Method tmp$memo$g__1723."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V
          31: areturn
    
      public java.lang.Object invoke(java.lang.Object);
        Code:
           0: aload_1
           1: aconst_null
           2: astore_1
           3: invokestatic  #42                 // Method invokeStatic:(Ljava/lang/Object;)Ljava/lang/Object;
           6: areturn
    
      public static {};
        Code:
           0: ldc           #45                 // String clojure.core
           2: ldc           #47                 // String atom
           4: invokestatic  #53                 // Method clojure/lang/RT.var:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Var;
           7: checkcast     #17                 // class clojure/lang/Var
          10: putstatic     #15                 // Field const__0:Lclojure/lang/Var;
          13: return
    }
    Compiled from "tmp.clj"
    public final class tmp$memo$g__1723 extends clojure.lang.RestFn {
      java.lang.Object cache;
    
      java.lang.Object f;
    
      public static final clojure.lang.Var const__0;
    
      public static final clojure.lang.Var const__1;
    
      public static final clojure.lang.Var const__2;
    
      public static final clojure.lang.Var const__3;
    
      public static final clojure.lang.Var const__4;
    
      public tmp$memo$g__1723(java.lang.Object, java.lang.Object);
        Code:
           0: aload_0
           1: invokespecial #13                 // Method clojure/lang/RestFn."<init>":()V
           4: aload_0
           5: aload_1
           6: putfield      #15                 // Field cache:Ljava/lang/Object;
           9: aload_0
          10: aload_2
          11: putfield      #17                 // Field f:Ljava/lang/Object;
          14: return
    
      public java.lang.Object doInvoke(java.lang.Object);
        Code:
           0: getstatic     #23                 // Field const__0:Lclojure/lang/Var;
           3: invokevirtual #29                 // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
           6: checkcast     #31                 // class clojure/lang/IFn
           9: getstatic     #34                 // Field const__1:Lclojure/lang/Var;
          12: invokevirtual #29                 // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
          15: checkcast     #31                 // class clojure/lang/IFn
          18: aload_0
          19: getfield      #15                 // Field cache:Ljava/lang/Object;
          22: invokeinterface #37,  2           // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;)Ljava/lang/Object;
          27: checkcast     #31                 // class clojure/lang/IFn
          30: aload_1
          31: invokeinterface #37,  2           // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;)Ljava/lang/Object;
          36: invokeinterface #37,  2           // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;)Ljava/lang/Object;
          41: dup
          42: ifnull        56
          45: getstatic     #43                 // Field java/lang/Boolean.FALSE:Ljava/lang/Boolean;
          48: if_acmpeq     57
          51: aconst_null
          52: pop
          53: goto          102
          56: pop
          57: getstatic     #46                 // Field const__2:Lclojure/lang/Var;
          60: invokevirtual #29                 // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
          63: checkcast     #31                 // class clojure/lang/IFn
          66: aload_0
          67: getfield      #15                 // Field cache:Ljava/lang/Object;
          70: getstatic     #49                 // Field const__3:Lclojure/lang/Var;
          73: invokevirtual #29                 // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
          76: aload_1
          77: getstatic     #52                 // Field const__4:Lclojure/lang/Var;
          80: invokevirtual #29                 // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
          83: checkcast     #31                 // class clojure/lang/IFn
          86: aload_0
          87: getfield      #17                 // Field f:Ljava/lang/Object;
          90: aload_1
          91: invokeinterface #55,  3           // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
          96: invokeinterface #58,  5           // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
         101: pop
         102: getstatic     #34                 // Field const__1:Lclojure/lang/Var;
         105: invokevirtual #29                 // Method clojure/lang/Var.getRawRoot:()Ljava/lang/Object;
         108: checkcast     #31                 // class clojure/lang/IFn
         111: aload_0
         112: getfield      #15                 // Field cache:Ljava/lang/Object;
         115: invokeinterface #37,  2           // InterfaceMethod clojure/lang/IFn.invoke:(Ljava/lang/Object;)Ljava/lang/Object;
         120: aload_1
         121: aconst_null
         122: astore_1
         123: invokestatic  #63                 // Method clojure/lang/RT.get:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
         126: areturn
    
      public int getRequiredArity();
        Code:
           0: iconst_0
           1: ireturn
    
      public static {};
        Code:
           0: ldc           #70                 // String clojure.core
           2: ldc           #72                 // String contains?
           4: invokestatic  #76                 // Method clojure/lang/RT.var:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Var;
           7: checkcast     #25                 // class clojure/lang/Var
          10: putstatic     #23                 // Field const__0:Lclojure/lang/Var;
          13: ldc           #70                 // String clojure.core
          15: ldc           #78                 // String deref
          17: invokestatic  #76                 // Method clojure/lang/RT.var:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Var;
          20: checkcast     #25                 // class clojure/lang/Var
          23: putstatic     #34                 // Field const__1:Lclojure/lang/Var;
          26: ldc           #70                 // String clojure.core
          28: ldc           #80                 // String swap!
          30: invokestatic  #76                 // Method clojure/lang/RT.var:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Var;
          33: checkcast     #25                 // class clojure/lang/Var
          36: putstatic     #46                 // Field const__2:Lclojure/lang/Var;
          39: ldc           #70                 // String clojure.core
          41: ldc           #82                 // String assoc
          43: invokestatic  #76                 // Method clojure/lang/RT.var:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Var;
          46: checkcast     #25                 // class clojure/lang/Var
          49: putstatic     #49                 // Field const__3:Lclojure/lang/Var;
          52: ldc           #70                 // String clojure.core
          54: ldc           #84                 // String apply
          56: invokestatic  #76                 // Method clojure/lang/RT.var:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Var;
          59: checkcast     #25                 // class clojure/lang/Var
          62: putstatic     #52                 // Field const__4:Lclojure/lang/Var;
          65: return
    }