Search code examples
javaarraylistcollections

Why sometimes returning a new List containing the List?


I see sometimes code like this :

public List<Data> retrieveDataFromInventory() {
  return new ArrayList<>(inventory.getListData().stream().filter(...).toList());
}

What's the difference with :

public List<Data> retrieveDataFromInventory() {
  return inventory.getListData().stream().filter(...).toList();
}

Solution

  • returning a new List containing the List

    new ArrayList<>(someList)

    You have misunderstood a few things. This does not create a list containing a list.

    This makes a new arraylist (i.e. it has the performance charactistics of an arraylist and uses arraylist's mechanics: A backing array that is grown as needed), and then fills it with every item in someList. In other words, it makes a copy of someList. Lets say someList is an instance of LinkedList, it doesn't make another LinkedList - it makes an ArrayList containing the same elements that the linked list did at the time you called new ArrayList<>(thatList) - from there on out they can start diverging.

    The difference is: .toList() returns an unmodifiable list. By making a new arraylist as a clone of this list, you then get to return an arraylist that is modifiable.

    The first snippet is highly suspect - why is some intermediate method doing the effort of making that a mutable list? Making that copy can be quite an expensive operation and normally a mutable list isn't necessary - hence, this first method is badly designed. If the caller to retrieveDataFromInventory wants to use that list as a 'scratch space' - it can make its own copy, that's not worth making a second method for.

    A few possible explanations:

    Misunderstanding of defensive copying

    When you have a mutable object that represents internal state (i.e. state that your code needs to control - i.e. state that you don't want to just change out from under your feet because some other code in perhaps even some other thread is messing with it directly), you don't want to hand refs to those objects out in public methods. Instead, you'd make a copy so that any messing is done to a copy and not to your internal state. However, this is not relevant here - stream already makes a copy. There is no need to copy the copy. The author misunderstood the principle of making defensive copies and has misapplied the rule here.

    Following older API design

    It is possible that the author made this method in the distant past, without streams; for example like this:

    var out = new ArrayList<Data>();
    for (var data : inventory.getListData()) {
      if (!.... (filter here)....) out.add(data);
    }
    return out;
    

    which does the same thing as the methods you pasted, except, that one really does return a mutable list specifically, and if the docs of this method indicate that this is something callers get to rely on (the API docs spell out: Hey, that list is mutable), then this rewrite is a bad idea and should be unwound - now you're making a second copy for the highly dubious benefit of avoiding some braces. This is very much not worth it - either one is equally readable.

    However, shiny new tools and all that - a few folks have gone far too far in refactoring their code and gone a little lambda crazy.

    Following fallout of an older API implementation detail

    What's more likely is the exact same situation as above, except, the javadoc do not spell out that the list can be messed with, but users of this API have assumed it; replacing it with the .stream().toList() code breaks users of the library. The right move is tricky - however, generally, it is a slightly better idea to just 'break' - you never promised that the list would be mutable, so any code that mutates it is at fault here, and you should just update this code and then let the maintainers that made the bad assumption fix it when they get around to it.

    This principle is a complicated concept, and has a name: Hyrum's Law. The correct choice (bend over backwards to support "API abusers", or, cause downstream users of your code to get bugs simply because they upgraded their dependency on your code) depends on the situation and is beyond the scope of Stack Overflow (because, it's opinions and not objectively decidable except on a case-by-case basis).