java performance arraylist hashmap

HashMap with consecutive Integers as keys vs. ArrayList


I stumbled upon some code that uses HashMap<Integer, HashMap<Integer, String>> to store a large Excel table, typically with more than 100,000 rows and 10 columns. Both the row and column indices are consecutive integers. I would definitely have used something like an ArrayList<ArrayList<String>> instead.

So could there be any significant advantages to using the HashMap? Do you see any serious performance issues (both memory-wise and runtime-wise)?


Solution

  • I feel compelled to contribute an answer, but credit should go to @Kayaman, who saw the obvious and commented first.

    The significant potential advantages you are looking for are speed, flexibility and space savings in the general case.

    Consider a 3x3 range of 9 cells with its top-left corner at cell A1, to which you then add a single new cell at ZZ49. With any data structure that allocates memory linearly, you suddenly need to grow memory by several orders of magnitude: column ZZ is the 702nd column, so a rectangle reaching ZZ49 covers 702 x 49 = 34,398 cells, nearly all of them empty. Depending on the data structure, you may also have to rearrange already-stored cells and initialise a large number of never-to-be-used nulls. Language and library implementations vary in the details but share similar wasteful disadvantages.

    Would Excel itself keep arrays/arraylists sized to cover every cell in every worksheet/range? Unlikely. In your example the range of cells may not be sparsely populated, but in general the cells that contain data in a spreadsheet are a tiny fraction of the maximum rectangular area allowed. A hashmap (or "multidimensional hashmapping") is therefore not an unreasonable choice of data structure for a general approach to such a map-to-Excel problem, although you might have good reason to choose differently in your specific application.
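
The sparse case described in the answer can be sketched in Java roughly as follows. This is only an illustration under assumptions, not the code from the question: the class `SparseSheetSketch` and its methods `put`, `get`, `storedCells`, `columnIndex` and `denseSlots` are hypothetical names, and indices are taken to be 1-based. It stores the 3x3 range plus the single cell at ZZ49 in a nested map and compares the number of stored entries with the number of slots a dense rectangle anchored at A1 would need.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the trade-off; all names here are assumptions.
class SparseSheetSketch {

    // Nested map as in the question: row index -> (column index -> cell value).
    private final Map<Integer, Map<Integer, String>> rows = new HashMap<>();

    // Stores a value, creating the inner map for the row on first use.
    void put(int row, int col, String value) {
        rows.computeIfAbsent(row, r -> new HashMap<>()).put(col, value);
    }

    // Returns the stored value, or null if the cell was never set.
    String get(int row, int col) {
        Map<Integer, String> cols = rows.get(row);
        return cols == null ? null : cols.get(col);
    }

    // Counts only the cells that actually hold data.
    int storedCells() {
        int n = 0;
        for (Map<Integer, String> cols : rows.values()) {
            n += cols.size();
        }
        return n;
    }

    // Converts an Excel column label to a 1-based index: A -> 1, Z -> 26, AA -> 27.
    static int columnIndex(String label) {
        int index = 0;
        for (char c : label.toCharArray()) {
            index = index * 26 + (c - 'A' + 1);
        }
        return index;
    }

    // Slots a dense rectangle anchored at A1 needs in order to reach the given cell.
    static long denseSlots(String columnLabel, int row) {
        return (long) columnIndex(columnLabel) * row;
    }

    public static void main(String[] args) {
        SparseSheetSketch sheet = new SparseSheetSketch();
        // The 3x3 range with its top-left corner at A1.
        for (int r = 1; r <= 3; r++) {
            for (int c = 1; c <= 3; c++) {
                sheet.put(r, c, "v" + r + "," + c);
            }
        }
        // The single outlying cell at ZZ49.
        sheet.put(49, columnIndex("ZZ"), "outlier");

        System.out.println("sparse entries: " + sheet.storedCells());
        System.out.println("dense slots:    " + denseSlots("ZZ", 49));
    }
}
```

Running `main` reports 10 stored entries against 34398 dense slots. For the densely populated table in the question (consecutive row and column indices with no gaps) the comparison goes the other way: a list of lists stores the same cells without boxed `Integer` keys or per-entry node objects, so the map's advantage applies only when the data is sparse.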