
What is the benefit of setting the capacity of an ArrayList explicitly


In Java's ArrayList we have a constructor:

ArrayList(int initialCapacity)

and two methods:

void ensureCapacity(int minCapacity)

void trimToSize()

Consider this code sample:

ArrayList<String> arrayList3 = new ArrayList<>(5);
System.out.println(arrayList3.size());

arrayList3.add("Zebra");
arrayList3.add("Giraffe");
arrayList3.add("Bison");

System.out.println(arrayList3);
System.out.println(arrayList3.size());

arrayList3.add("Rhino");
arrayList3.add("Hippo");
arrayList3.add("Elephant");
arrayList3.add("Antelope");

System.out.println(arrayList3);
System.out.println(arrayList3.size());

Output:

0
[Zebra, Giraffe, Bison]
3
[Zebra, Giraffe, Bison, Rhino, Hippo, Elephant, Antelope]
7

Here, I fail to see how setting an initial capacity affects the execution of the program. ArrayList is a flexible list that changes size on demand. So what is the significance of setting the capacity explicitly?

And if I do want to set the capacity explicitly, is there any method to view the current capacity? int size() clearly does not apply here.


Solution

  • ArrayList is an implementation of the dynamic array data structure.

    It resizes when its underlying array gets full (i.e. the current list index exceeds the last valid index of the underlying array).

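    When that happens, add() (or addAll()) internally invokes the private method grow(), which allocates a larger array and copies the existing elements into it. In current OpenJDK implementations the new capacity is roughly 1.5 times the old one (the old capacity plus half of it), or larger if that still isn't enough to hold the elements being added. The exact growth policy is an implementation detail: the Javadoc only guarantees that adding an element has constant amortized time cost.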

    The growth has a cost of O(n) because all previously added elements need to be copied into the new array.

    Reminder: when resizing isn't required, a new element will be added in constant time O(1).
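
    To get a feel for those costs, here is a minimal sketch of a dynamic array that counts how many elements get copied during resizing. GrowthSketch and copiesFor are made-up names, and the growth factor (about 1.5x) is an assumption for illustration; this is not the actual ArrayList source.

    ```java
    import java.util.Arrays;

    // Hypothetical, simplified dynamic array; not the real ArrayList code.
    class GrowthSketch {
        private Object[] elements;
        private int size;
        private long copiedElements; // total elements moved by all resizes so far

        GrowthSketch(int initialCapacity) {
            elements = new Object[initialCapacity];
        }

        void add(Object value) {
            if (size == elements.length) {
                // Assumed policy: grow by about 50%, and by at least one slot.
                int newCapacity = Math.max(size + 1, elements.length + (elements.length >> 1));
                elements = Arrays.copyOf(elements, newCapacity); // O(n) copy
                copiedElements += size;
            }
            elements[size++] = value; // O(1) when there is room
        }

        // Adds `count` elements and reports how many element copies resizing caused.
        static long copiesFor(int initialCapacity, int count) {
            GrowthSketch list = new GrowthSketch(initialCapacity);
            for (int i = 0; i < count; i++) {
                list.add(i);
            }
            return list.copiedElements;
        }

        public static void main(String[] args) {
            System.out.println("start at 10:     " + copiesFor(10, 50_000));
            System.out.println("start at 50,000: " + copiesFor(50_000, 50_000));
        }
    }
    ```

    With a small starting capacity the sketch copies elements many times over; with the capacity set up front it copies none.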

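    The no-argument constructor creates an ArrayList with an initial capacity of 10 (in current OpenJDK the backing array of that size is allocated lazily, when the first element is added).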

    If you expect a newly created ArrayList to eventually contain, say, 50,000 elements, it makes sense to pass an initial capacity of 50,000 to the overloaded constructor, which improves performance by avoiding repeated resizing.

    You can also use the method ensureCapacity() for that. It is available on the ArrayList class but not on the List interface, because the notion of capacity doesn't apply to implementations such as LinkedList, which isn't backed by an array.
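
    Both options can be sketched like this. PresizeExample and its method names are made up for the example; the constructor and ensureCapacity() are the real ArrayList API. Note that the variable has to be declared as ArrayList (not List) for ensureCapacity() to be callable.

    ```java
    import java.util.ArrayList;
    import java.util.List;

    class PresizeExample {
        // Option 1: pass the expected size to the constructor.
        static List<Integer> fillPresized(int expected) {
            ArrayList<Integer> list = new ArrayList<>(expected);
            for (int i = 0; i < expected; i++) {
                list.add(i); // no resizing needed while size <= expected
            }
            return list;
        }

        // Option 2: reserve room on an existing list before a large batch of adds.
        static List<Integer> fillWithEnsureCapacity(int expected) {
            ArrayList<Integer> list = new ArrayList<>();
            list.ensureCapacity(expected); // declared on ArrayList, not on List
            for (int i = 0; i < expected; i++) {
                list.add(i);
            }
            return list;
        }

        public static void main(String[] args) {
            System.out.println(fillPresized(50_000).size());
            System.out.println(fillWithEnsureCapacity(50_000).size());
        }
    }
    ```

    Either way the resulting list behaves exactly like one created with the default capacity; only the number of internal reallocations differs.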

    "Is there any method to view the current capacity?"

    No, there isn't. That's called encapsulation. ArrayList, StringBuilder, HashMap, etc. are backed by a plain array, but they will not allow interacting with their underlying array directly.

    But if you have a case where the list first grows large and then a lot of its elements are removed, and you want to release the unoccupied heap space, you can use the method trimToSize():

    Trims the capacity of this ArrayList instance to be the list's current size. An application can use this operation to minimize the storage of an ArrayList instance.

    But it has to be used with caution, because it can lead to cyclic growth and trimming, which will cause performance degradation.

    Note that there's no need to worry about unoccupied space if the list is moderate in size, or if you do not expect a large share of the data (say, 80%) to be removed in one go. Even if the list is huge, if only 50% of its elements are removed and you call trimToSize(), the very next add() has to allocate a larger array and copy every element again, so much of the reclaimed space comes back at the cost of an O(n) copy. That is the scenario of a continuously growing and shrinking list, which performs badly.
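
    As an illustration of the case where trimming does make sense, here is a small sketch: the list shrinks drastically once and is trimmed once afterwards. TrimExample and shrinkOnce are made-up names, and the sizes are arbitrary.

    ```java
    import java.util.ArrayList;

    class TrimExample {
        // Builds a large list, removes most of it in one go, then trims a single time.
        static ArrayList<Integer> shrinkOnce(int initialCount, int keepEvery) {
            ArrayList<Integer> list = new ArrayList<>(initialCount);
            for (int i = 0; i < initialCount; i++) {
                list.add(i);
            }
            // Bulk removal: keep only the multiples of keepEvery.
            list.removeIf(n -> n % keepEvery != 0);
            // One trim after the bulk removal releases the unused part of the array.
            list.trimToSize();
            return list;
        }

        public static void main(String[] args) {
            ArrayList<Integer> list = shrinkOnce(100_000, 100);
            System.out.println(list.size()); // prints 1000
        }
    }
    ```

    Calling trimToSize() inside a loop that keeps adding and removing elements would be the opposite pattern, the one to avoid.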

    As a possible alternative, if you have a case where most of the data may be removed from a list, then instead of using trimToSize() you can pick out the elements that have to be preserved, place them into a new list, and drop the reference to the previous one.
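
    A rough sketch of that alternative, assuming the elements to keep can be described by a predicate. CopySurvivors and keepOnly are made-up names, not part of any library.

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.function.Predicate;

    class CopySurvivors {
        // Copies the elements that should be preserved into a fresh list.
        static <T> List<T> keepOnly(List<T> source, Predicate<? super T> keep) {
            List<T> survivors = new ArrayList<>();
            for (T item : source) {
                if (keep.test(item)) {
                    survivors.add(item);
                }
            }
            return survivors;
        }

        public static void main(String[] args) {
            List<Integer> big = new ArrayList<>();
            for (int i = 0; i < 100_000; i++) {
                big.add(i);
            }
            // Reassigning the variable drops the last reference to the old list,
            // so its large backing array becomes eligible for garbage collection.
            big = keepOnly(big, n -> n % 1000 == 0);
            System.out.println(big.size()); // prints 100
        }
    }
    ```

    The new list only ever grows to fit the survivors, so there is nothing to trim afterwards.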