Search code examples
javalistcollections

Is it necessary to copy a list to be safe?


The implementation (and documentation) of Stream.toList is like this:

Collections.unmodifiableList(new ArrayList<>(Arrays.asList(this.toArray())))

I'm wondering why the list returned by Arrays.asList needs to be copied to a new ArrayList. Would it not be sufficient to just return the following?

Collections.unmodifiableList(Arrays.asList(this.toArray()))

I would like to know, if I write a method that returns a list that it creates like this, would there be any issues if I didn't bother to make a defensive copy of it?


Solution

  • It has a purpose

    I'm wondering why the list returned by Arrays.asList needs to be copied to a new ArrayList. Would it not be sufficient to just return the following?

    Which .toArray() are we talking about? If we're talking about the one from j.u.Collection, you're right: The javadoc of that toArray() method guarantees that it is a newly allocated array, hence, no need to make a defensive copy. However, that's not the toArray() involved here: We're talking about j.u.s.Stream's .toArray(). Its javadoc is a lot shorter:

        /**
         * Returns an array containing the elements of this stream.
         *
         * <p>This is a <a href="package-summary.html#StreamOps">terminal
         * operation</a>.
         *
         * @return an array, whose {@linkplain Class#getComponentType runtime component
         * type} is {@code Object}, containing the elements of this stream
         */
    

    Unlike j.u.Collection toArray(), this javadoc makes no notes either way about the nature of that array. Hence, evidently, the default impl in the Stream javadoc does not presume the array is necessarily 'safe' ('safe' in the sense: Will not later be modified).

    I think you know, but just to be crystal clear:

    Arrays.asList is almost always a mistake. It's a worst-of-both-worlds list: It creates a light wrapper around the array you pass to it. So, set() works (because foo[x] = value; is possible and indeed, is how that is implemented), but .add() doesn't (because you can't change the size of an array). The list is not immutable (i.e. if passing it to code outside of your direct control you need to extensively document, or, make a defensive copy), but also not a fully featured list. Use List.of instead, which makes fully immutable lists - lists you can pass around to whomever you like without worrying about losing control of integrity.

    EDIT: And to explain why its Collections.unmodifiableList(new ArrayList<...) instead of List.copyOf - The immutable lists that those static methods on List make throw NPEs when null is in there, but streams can have null, so you can't use that. This edit brought to you by a comment from @Rob Spoor :)

    This is not as big an issue as you think, though

    Note that the code you found is merely the default implementation - any particular implementation of stream is free to come up with a better way to do this. Just about every implementation you are bound to run into overrides the definition. For example, if you call an arraylist's stream(), you end up with this implementation instead, which is from streamops and is the same code path used by all the collections in the core libs. Hence, the actual code you end up using, is this:

        @Override
        public List<P_OUT> toList() {
            return SharedSecrets.getJavaUtilCollectionAccess()
            .listFromTrustedArrayNullsAllowed(this.toArray());
        }
    

    'trusted array' here refers to the notion that the array passed to it is guaranteed not going to change out from under you. There is no way the compiler can enforce that guarantee, but arraylist's (and indeed, all non-broken collections, given that the javadoc of j.u.Collection demands it!) toArray() makes new arrays so it's safe to do that, and this is far more efficient (avoids a bunch of needless copies).