Tags: scala, apache-spark-sql, treenode, self-type

How to explain TreeNode type restriction and self-type in Spark's TreeNode?


From the definition of TreeNode in Spark SQL:

abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
  self: BaseType =>
  ...
}

What does it say about the subtypes of TreeNode and BaseType? What's acceptable?


Solution

    Self-types

    First, have a look at so-called self-types. Andrew Rollins's blog post "Self Type Annotations vs. Inheritance" gives a nice introduction.

    Basically, a self-type, written as

    trait Foo { self: SomeType =>
      ...
    }
    

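    says that the trait Foo can only be mixed into a class that also implements SomeType. Unlike inheritance, a self-type only states a requirement on the final class: Foo does not itself become a subtype of SomeType, but it may use SomeType's members as if they were its own.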
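    As a minimal sketch of that rule (all names below are hypothetical and chosen only for illustration), a trait with a self-type can call the members of the required type, and the compiler rejects any class that mixes it in without also providing that type:

```scala
// Hypothetical names, for illustration only.
trait Logger {
  def log(msg: String): String
}

trait ConsoleLogger extends Logger {
  def log(msg: String): String = s"[log] $msg"
}

// The self-type says: whoever mixes in Service must also be a Logger.
trait Service { self: Logger =>
  // Logger's members are in scope here because of the self-type.
  def run(): String = log("service started")
}

// Compiles: the Logger requirement is satisfied by ConsoleLogger.
class LoggingService extends Service with ConsoleLogger

// Does not compile: the self-type requirement is not met.
// class BrokenService extends Service
```

    Note that Service is not a subtype of Logger here; only the final class LoggingService is both.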

    Often, self-types are used for Dependency Injection, such as in the Cake Pattern.

    Type-restrictions

    Given the type definition (a variant of the one in the question that moves Product from the parent list into the self-type):

    class TreeNode[BaseType <: TreeNode[BaseType]] {
      self: BaseType with Product =>
      // ...
    }
    
    1. The definition TreeNode[BaseType <: TreeNode[BaseType]] puts an upper bound on the type parameter: BaseType must be a subtype of TreeNode[BaseType]. Roughly speaking, the type parameter must itself be a TreeNode. This kind of recursive bound is known as F-bounded polymorphism.

    2. The self-type demands that every subclass of TreeNode is also a BaseType and a Product. A subclass that does not provide both is rejected by the compiler.
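    To see what the two restrictions buy in practice, here is a simplified sketch. It is not Spark's actual TreeNode; Node, Expr, Lit and Add are hypothetical names, and the Product part is left out to keep it short. The bound lets methods accept and return the concrete node type, and the self-type lets `this` be used where that type is expected:

```scala
// Simplified sketch; not Spark's actual TreeNode.
abstract class Node[T <: Node[T]] { self: T =>
  def children: Seq[T]
  def withNewChildren(newChildren: Seq[T]): T

  // Rewrites the tree bottom-up. Thanks to the self-type, `this` is a valid T.
  def transformUp(rule: PartialFunction[T, T]): T = {
    val rewritten: T =
      if (children.isEmpty) this
      else withNewChildren(children.map(_.transformUp(rule)))
    // Apply the rule if it matches, otherwise keep the node unchanged.
    rule.applyOrElse(rewritten, (n: T) => n)
  }
}

// Expr is its own BaseType, so both restrictions are satisfied.
sealed abstract class Expr extends Node[Expr]

case class Lit(value: Int) extends Expr {
  def children: Seq[Expr] = Nil
  def withNewChildren(newChildren: Seq[Expr]): Expr = this
}

case class Add(left: Expr, right: Expr) extends Expr {
  def children: Seq[Expr] = Seq(left, right)
  def withNewChildren(newChildren: Seq[Expr]): Expr =
    Add(newChildren(0), newChildren(1))
}
```

    With these definitions, calling transformUp on Add(Lit(1), Add(Lit(2), Lit(3))) with the rule `case Add(Lit(a), Lit(b)) => Lit(a + b)` folds the whole tree into Lit(6), and the result is typed as Expr rather than as a bare Node.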

    Concrete examples

    Example 1

    class IntTreeNode extends TreeNode[Int] {}
    

    does not compile, for two reasons:

    1. The type argument Int does not conform to class TreeNode's type parameter bounds, i.e. [BaseType <: TreeNode[BaseType]].
    2. Illegal inheritance due to the self-type restriction: IntTreeNode is neither an Int nor a Product.

    Example 2

    class IntTreeNode2 extends TreeNode[IntTreeNode2]
    

    does not compile. The bound on BaseType is satisfied this time, but:

    1. Illegal inheritance due to the self-type restriction: IntTreeNode2 is not a Product.

    Example 3

    class TupleTreeNode extends TreeNode[TupleTreeNode] with Product1[Int] {
      // implementation just to be a `Product1` 
      override def _1: Int = ???
      override def canEqual(that: Any): Boolean = ???
    }
    

    compiles because:

    1. The type constraint on BaseType and the self-type are both fulfilled, which is exactly what the definition requires.
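    For comparison, here is a sketch using the definition exactly as quoted in the question, where Product is a parent of TreeNode rather than part of the self-type. PlanNode, Leaf and Branch are hypothetical names, not Spark classes. Since every case class is a Product, declaring the concrete nodes as case classes is enough to satisfy the requirement:

```scala
// Same shape as the definition quoted in the question.
abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
  self: BaseType =>
}

// Hypothetical node hierarchy; the names are illustrative, not Spark classes.
// PlanNode is abstract, so it does not have to implement Product itself.
abstract class PlanNode extends TreeNode[PlanNode]

// Case classes get productArity, productElement and canEqual generated for them.
case class Leaf(id: Int) extends PlanNode
case class Branch(left: PlanNode, right: PlanNode) extends PlanNode
```

    A plain (non-case) concrete subclass of PlanNode would have to implement the abstract Product members by hand, much like Example 3 does for Product1.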

    Remark

    An example similar to yours (Catalyst) is also given on docs.scala-lang.org.