Tags: java, java-8, functional-programming, functional-interface

In Java why is Function.identity() a static method instead of something else?


Java 8 added functional programming constructs, including the Function interface and its static identity() method.

Here's the current structure of this method:

// Current implementation of this function in the JDK source
static <T> Function<T, T> identity() {
    return t -> t;
}

// Can be used like this
List<T> sameList = list.stream().map(Function.identity()).collect(Collectors.toList());

However, there's a second way to structure it:

// Alternative implementation of the method
static <T> T identity(T in) {
    return in;
}

// Can be used like this
List<T> sameList = list.stream().map(Function::identity).collect(Collectors.toList());

There's even a third way to structure it:

// Third implementation
static final Function<T, T> IDENTITY_FUNCTION = t -> t;

// Can be used like this
List<T> sameList = list.stream().map(Function.IDENTITY_FUNCTION).collect(Collectors.toList());

Of the three approaches, the first, which is the one actually used in the JDK, looks less memory efficient, as it appears to create a new object (lambda) on every use, while the second and third don't. According to this SO answer that's not actually the case, so ultimately all three approaches seem roughly equivalent performance-wise.

Using the second approach allows the method to be used as a method reference, which is similar to how many other standard library methods are used in functional constructs. E.g. stream.map(Math::abs) or stream.map(String::toLowerCase).

Overall, why use the first approach, which looks (though ultimately isn't) less performant and is different from other examples?


Solution

  • TL;DR Using Function.identity() creates only one object, so it's very memory efficient.


    The third implementation doesn't compile, because a static field cannot refer to a type parameter T, so that's not an option.
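For illustration, a constant can be made to compile by fixing its type arguments and exposing it through a generic method with an unchecked cast. This is only a sketch: the class Identities and its members are hypothetical names, not part of the JDK.

```java
import java.util.function.Function;

// Hypothetical holder class; nothing here exists in the JDK under these names.
final class Identities {
    // A static field must pick concrete type arguments, so Object is used here.
    private static final Function<Object, Object> IDENTITY = t -> t;

    // The generic method reintroduces T. The unchecked cast is safe because
    // the function only ever returns the argument it was given.
    @SuppressWarnings("unchecked")
    static <T> Function<T, T> identity() {
        return (Function<T, T>) IDENTITY;
    }
}
```

Even with a constant behind it, callers still go through a static generic method, so the call site ends up looking just like the first implementation.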

    In the second implementation, every place in the code where you write Function::identity gets its own object instance.

    In the first implementation, whenever you call Function.identity(), a reference to the same lambda object is returned.

    It is simple to see for yourself. Start by creating the two identity methods in the same class, renaming them to identity1 and identity2 so they stay distinguishable.

    static <T> Function<T, T> identity1() {
        return t -> t;
    }
    
    static <T> T identity2(T in) {
        return in;
    }
    

    Write a test method that accepts a Function and prints the object, so we can see its unique identity, as reflected by the hash code.

    static <A, B> void test(Function<A, B> func) {
        System.out.println(func);
    }
    

    Call the test method repeatedly to see if each one gets a new object instance or not (my code is in a class named Test).

    test(Test.identity1());
    test(Test.identity1());
    test(Test.identity1());
    test(Test::identity2);
    test(Test::identity2);
    for (int i = 0; i < 3; i++)
        test(Test::identity2);
    

    Output

    Test$$Lambda$1/0x0000000800ba0840@7adf9f5f
    Test$$Lambda$1/0x0000000800ba0840@7adf9f5f
    Test$$Lambda$1/0x0000000800ba0840@7adf9f5f
    Test$$Lambda$2/0x0000000800ba1040@5674cd4d
    Test$$Lambda$3/0x0000000800ba1440@65b54208
    Test$$Lambda$4/0x0000000800ba1840@6b884d57
    Test$$Lambda$4/0x0000000800ba1840@6b884d57
    Test$$Lambda$4/0x0000000800ba1840@6b884d57
    

    As you can see, multiple statements calling Test.identity1() all get the same object, but multiple statements using Test::identity2 all get different objects.

    It is true that repeated executions of the same statement get the same object (as seen in the result from the loop), but that's different from the result obtained from different statements.

    Conclusion: Using Test.identity1() creates only one object, so it's more memory efficient than using Test::identity2.
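The same check can be sketched with reference comparison instead of reading hash codes from printed output. The class name IdentityCheck and its helper methods are assumptions made for this example. As far as I know, the language specification does not promise either result, so the printed booleans reflect what the running JVM happens to do.

```java
import java.util.function.Function;

// Hypothetical demo class; identity1 and identity2 mirror the methods above.
class IdentityCheck {
    static <T> Function<T, T> identity1() {
        return t -> t;
    }

    static <T> T identity2(T in) {
        return in;
    }

    // Two separate statements calling the factory method.
    static boolean sameFromFactory() {
        Function<String, String> a = identity1();
        Function<String, String> b = identity1();
        return a == b;
    }

    // Two separate statements, each containing its own method reference.
    static boolean sameFromMethodRefs() {
        Function<String, String> c = IdentityCheck::identity2;
        Function<String, String> d = IdentityCheck::identity2;
        return c == d;
    }

    public static void main(String[] args) {
        System.out.println("identity1() called twice, same object: " + sameFromFactory());
        System.out.println("two method references, same object: " + sameFromMethodRefs());
    }
}
```

Comparing with == avoids relying on the default toString() output, whose hash codes could in principle collide for distinct objects.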