Loading big data sets to ArrayList in Java (max capacity of ArrayList)


I am trying to load a data set of more than 2^32 elements and put those elements in an ArrayList, anArrayList. The data is in chronological order, so I use an ArrayList to preserve that order. At the same time, I want quick access to an element by its String elementID, so I use a HashMap to map each elementID to the element object in anArrayList. An integer, currentAddingAt, tracks the index in anArrayList at which the next element is added. Here is the related code:

ArrayList<ElementX> anArrayList;
int currentAddingAt;
HashMap<String, ElementX> elementToObjHashMap;

... ...

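public void addAnElement(ElementX e){
    // Insert at the current position to preserve chronological order.
    anArrayList.add(currentAddingAt, e);
    // Map the element's ID to the same object that was just stored.
    elementToObjHashMap.put(e.getElementID(), anArrayList.get(currentAddingAt));
    // Advance to the next free index.
    currentAddingAt++;
}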

A problem came up when I changed the type of currentAddingAt from int to long: ArrayList's get(int index) method only takes an int argument, according to Oracle's documentation (http://docs.oracle.com/javase/7/docs/api/java/util/ArrayList.html). This also makes me wonder:

Can an ArrayList's capacity be larger than the largest int in Java (2^32)?

What are the options other than ArrayList and HashMap in this case (keeping a large data set in order while still being able to map a key to its object quickly)? Do I need a library (or even a framework) other than plain Java?


Solution

  • Can an ArrayList's capacity be larger than the largest int in Java (2^32)?

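    No. ArrayList is backed by an array, and Java arrays are indexed by int, so it can't hold more than 2^31-1 elements (Integer.MAX_VALUE; note that the largest int is 2^31-1, not 2^32). The same limit applies to every Collection if you want size() and toArray() to work, because size() returns an int and toArray() returns an array.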

    To go past that limit you'll need to store lists of lists, but I bet there's a library that can do it. I haven't used that part of it, but fastutil has big data structures in addition to its primitive-type collections.
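
For illustration, here is a minimal sketch of the lists-of-lists idea in plain Java. The class name ChunkedList, its methods, and the chunk size are hypothetical names made up for this example; this is not fastutil's API, and it assumes the list is append-only, which matches data loaded in chronological order. A long index is split into a chunk number and an offset within that chunk, so every underlying ArrayList stays well under the int limit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical append-only list addressed by long indices, stored as fixed-size chunks. */
class ChunkedList<E> {
    private final int chunkSize;
    private final List<List<E>> chunks = new ArrayList<>();
    private long size = 0;

    ChunkedList(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    /** Appends e and returns the long index it was stored at. */
    long add(E e) {
        // Start a new chunk when there is none yet or the last one is full.
        if (chunks.isEmpty() || chunks.get(chunks.size() - 1).size() == chunkSize) {
            chunks.add(new ArrayList<>());
        }
        chunks.get(chunks.size() - 1).add(e);
        return size++;
    }

    /** Returns the element at a long index by locating its chunk and offset. */
    E get(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        int chunk = (int) (index / chunkSize);   // which inner list
        int offset = (int) (index % chunkSize);  // position inside that list
        return chunks.get(chunk).get(offset);
    }

    long size() {
        return size;
    }

    public static void main(String[] args) {
        // 2^20 elements per chunk is an arbitrary choice for this sketch.
        ChunkedList<String> events = new ChunkedList<>(1 << 20);
        Map<String, String> byId = new HashMap<>();
        for (String id : new String[] {"e1", "e2", "e3"}) {
            String element = "payload-of-" + id;
            events.add(element);    // keeps chronological order
            byId.put(id, element);  // quick lookup by ID
        }
        System.out.println(events.get(1L));
        System.out.println(byId.get("e3"));
        System.out.println(events.size());
    }
}
```

The sketch only addresses the ordered-list side. HashMap also reports its size as an int, so the ID lookup side would need its own partitioning or a library structure at this scale.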