java generics wildcard type-inference method-reference

Method reference and type inference in java 8


I'm facing an issue with method references and type inference.

The Parent class has a method that returns a generic type T:

public class Parent {
    public <T extends Parent> T get() {
        ...
    }
}

The following code fails:

List<Parent> parentList = new ArrayList<>();
List<Parent> collect = parentList.stream().map(Parent::get).collect(Collectors.toList()); // error

Error message:

no instance(s) of type variable(s) exist so that Object conforms to Parent
inference variable T has incompatible bounds: equality constraints: Parent lower bounds: Object

However, the following code succeeds:

List<Parent> parentList = new ArrayList<>();
Function<Parent, Parent> func = Parent::get;
List<Parent> collect = parentList.stream().map(func).collect(Collectors.toList());

Why can't the compiler infer the return value of the get method in the Parent class, even though it's at least the Parent type?


Solution

  • Generics inference doesn't cover an infinite expanse of chained method invocations.

    Required background knowledge: Java inferencing burns the candle at both ends

    The compiler does both inside-out and outside-in type inference.

    Inside-out is the standard approach the compiler uses. Inside-out is inferring, say:

    List<String> x = someCollection.stream().map(String::toLowerCase).toList();
    

    by first determining what the type of someCollection is, then, with that done, figuring out what the type of that.stream() is, then what the type of that.map(String::toLowerCase) is, and so on.

    However, the compiler also needs to apply outside-in inference. This comes up with lambdas and method references. In Java, this on its own:

    String::toLowerCase;
    

    is meaningless, and indeed, Object o = String::toLowerCase; does not compile. Neither lambdas ((args) -> code) nor method references (String::toLowerCase) are allowed unless the context in which they appear tells the compiler which functional interface is required. This requires the compiler to go to Australia, so to speak, and work outside-in. Add the fact that Java allows method overloading, and this song and dance routine between inside-out and outside-in inferencing gets quite complicated.
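    As a rough illustration of that point, here is a minimal sketch (the class name TargetTypeDemo and its helper methods are made up for this example): the same method reference compiles as soon as the context supplies a functional interface, whether through a variable type or a cast.

    ```java
    import java.util.function.Function;
    import java.util.function.UnaryOperator;

    // Hypothetical demo class; none of these names come from the question.
    class TargetTypeDemo {
        // The declared variable type tells the compiler which functional interface is meant.
        static String viaFunction(String s) {
            Function<String, String> f = String::toLowerCase;
            return f.apply(s);
        }

        // The very same method reference, now interpreted as a UnaryOperator.
        static String viaOperator(String s) {
            UnaryOperator<String> op = String::toLowerCase;
            return op.apply(s);
        }

        // A cast also provides the target type.
        static String viaCast(String s) {
            return ((Function<String, String>) String::toLowerCase).apply(s);
        }

        public static void main(String[] args) {
            // Object o = String::toLowerCase; // does not compile: Object is not a functional interface
            System.out.println(viaFunction("ABC"));
            System.out.println(viaOperator("DEF"));
            System.out.println(viaCast("GHI"));
        }
    }
    ```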

    This is the reason that Java does not take either inside-out or outside-in inferencing to infinity: if it did, it would be fairly trivial to write code that causes the compiler to take a year (literally) to finish.

    Where the inference fails

    Here's how Java could theoretically have figured this one out, given:

    List<Parent> collect = parentList.stream().map(Parent::get).toList();
    

    (NB: I have simplified things by using the stream's toList() method directly instead of collect(Collectors.toList()). Not that this makes it work; it's just that, given that even that fails, it's easier to explain.)

    Candle end 1: Outside-in

    The compiler would have to do some outside-in inferencing: start with the notion that the result of .toList() is assigned to a variable of type List<Parent>. Given that the sig of toList() is Stream<T>: List<T> toList(), it can lock in that T must be Parent, and thus the receiver of the toList() call (in x.method(), x is called the 'receiver') must be a Stream<Parent>.

    Therefore, the output of .map(Parent::get) must be Stream<Parent>. The signature of map is:

     Stream<U>:  <R> Stream<R> map(Function<? super U, ? extends R> mapper);
    

    (Each layer has its own take on what the generics mean, so to avoid confusion I've renamed the 'T' being used here as U)

    So we lock in <R> as <Parent>, making that:

     Stream<U>: Stream<Parent> map(Function<? super U, ? extends Parent> mapper)
    

    And here this part of the process has to end, because we can't find a way to continue inferring types for the generics: what the heck is U? We can't tell. If we just try Object, the method reference can't be interpreted correctly: Parent::get does not 'fit' Function<Object, Parent>, because Object has no get() method. Thus, failure.

    Candle end 2: Inside-out

    So, we go to the other end.

    parentList is a List<Parent>, we know that. It has a stream() method whose sig is List<T>: Stream<T>, so, parentList.stream() is of type Stream<Parent>. Its map method is called, which looks like:

    Stream<T>: <R> Stream<R> map(Function<? super T, ? extends R> mapper);
    

    Locking in what we know, we can turn that into:

    <R> Stream<R> map(Function<? super Parent, ? extends R> mapper);
    

    And that's where it ends. We can't determine R from this end.

    You might think: Well, hold on! The mapping function is Parent::get, and that tells you that R is <? extends Parent>. Simplify this to Parent and we can continue with the inference!

    Ah, but in order to even interpret what Parent::get is supposed to mean, the compiler needs to know which functional interface is expected in its context. It cannot just have a go at checking out what Parent::get might be about before finishing this step, so, unfortunately, that's not an inference the compiler can apply.

    Thus, the compiler does what it usually does and just tries Object. That actually works: Parent::get 'fits' Function<Parent, Object> (because get() returns a Parent, which is certainly also an Object, so that is fine). It then continues and gets as far as toList(), which produces a List<Object>, which cannot be assigned to the variable. But backtracking to the point where the mistake was evidently made requires going back who knows how many steps, and javac isn't going to do that, so the error stays, and that is exactly the error you end up seeing.

    Combining the two

    If we combine the two we can 'get there': we can derive R by going outside-in and T by going inside-out, and with both determined, we can finish the job. However, that's not how far Java inference can usually go: combining the results of the two approaches to determine a single type is too complicated. Using inside-out inference, Java can't determine the signature at the map() point (T can be figured out but R cannot), and using outside-in inference, it can't determine it either (R can be figured out, but T cannot). Thus, map cannot be figured out, and Java then tries one more time by just going with Object on all the things. That fails, and a compiler error is shown instead.

    How to fix it

    Give the compiler a hint so that the two approaches meet in a way that doesn't fall right in the middle of determining a single type.

    For example:

    List<Parent> collect = parentList.stream().<Parent>map(Parent::get).toList();
    

    (You can replace toList() with collect(Collectors.toList()); it'll still work.) This 'hint' lets the inside-out process fix R, because you explicitly tell the compiler what R is: Parent. Your approach (first assigning the mapping function to a variable of type Function<Parent, Parent>) trivially accomplishes the same goal.
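    Both hints can be put side by side in a small runnable sketch. The class InferenceHintDemo and its method names are invented for illustration, and because the question elides the body of get(), the unchecked cast of this used here is purely an assumption. The sketch uses collect(Collectors.toList()) so it also runs on Java versions that predate Stream.toList().

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.function.Function;
    import java.util.stream.Collectors;

    // Hypothetical demo class; only Parent and get() come from the question.
    class InferenceHintDemo {
        static class Parent {
            // Assumed body: the question only shows "...".
            @SuppressWarnings("unchecked")
            public <T extends Parent> T get() {
                return (T) this;
            }
        }

        // Hint 1: an explicit type argument on map() pins R to Parent.
        static List<Parent> withTypeWitness(List<Parent> parents) {
            return parents.stream().<Parent>map(Parent::get).collect(Collectors.toList());
        }

        // Hint 2: the variable's type gives the method reference its target type up front.
        static List<Parent> withTypedFunction(List<Parent> parents) {
            Function<Parent, Parent> getter = Parent::get;
            return parents.stream().map(getter).collect(Collectors.toList());
        }

        public static void main(String[] args) {
            List<Parent> parentList = new ArrayList<>();
            parentList.add(new Parent());
            parentList.add(new Parent());
            System.out.println(withTypeWitness(parentList).size());
            System.out.println(withTypedFunction(parentList).size());
        }
    }
    ```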

    Important note about your snippet: Don't do that!

    public class Parent {
       public <T extends Parent> T get() {
           ...
       }
    }
    

    I hope that's oversimplified for the purpose of asking this question, because as written, that code is useless / dangerous.

    You want:

    public Parent get() { ... }
    

    What you wrote is saying: "There is some type T that is not known to this method and cannot be known, not even at runtime. A caller picks a T. They can pick whatever they want, as long as they pick either Parent or something that is a subtype of Parent. This method guarantees that the value it returns fits this choice".

    This is not possible, because you don't know what they picked. Whatever you return must work for any choice the caller makes, and you don't know which choice they made. That means these are the only legitimate ways to write that method:

    • It returns null. Always. Because null is all types at once.
    • It never returns normally. All code paths through it end in throw, or, an endless loop, or, a JVM shutdown. By never having to actually come up with a value to return, you get away with guaranteeing that you return a value of a type you don't actually know and cannot know.

    Or, more likely, you're ugly-casting: Adding a (T) cast, and either accepting the compiler's dire warning about what you're doing or using @SuppressWarnings to get rid of that.
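    To make the danger concrete, here is a small sketch of what the ugly-cast variant can do. The names UncheckedGetDemo, Child and failsAtCallSite are hypothetical, and the body of get() is an assumption, since the question does not show it. The cast to T is not checked inside get() because of erasure, so the failure only surfaces at the call site, where the compiler inserts a cast to whatever T the caller picked.

    ```java
    // Hypothetical demo class; only Parent and get() come from the question.
    class UncheckedGetDemo {
        static class Parent {
            // Assumed body: an unchecked cast, with the compiler warning suppressed.
            @SuppressWarnings("unchecked")
            public <T extends Parent> T get() {
                return (T) this; // no runtime check happens here
            }
        }

        static class Child extends Parent {}

        // Returns true if the caller's choice of T blows up at the call site.
        static boolean failsAtCallSite() {
            Parent plain = new Parent();
            try {
                Child c = plain.get(); // compiles without complaint: the caller picked T = Child
                return false;
            } catch (ClassCastException e) {
                return true; // a plain Parent is not a Child
            }
        }

        public static void main(String[] args) {
            System.out.println(failsAtCallSite());
        }
    }
    ```

    Declaring the method as public Parent get() avoids this: callers that really need a subtype then have to write the cast themselves, where it is visible.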