
List vs. Map: Which takes less space and is more efficient?


I have two classes SensorNetwork and Sensor.

class SensorNetwork
{
    Set<Integer> sensorIDs; // Sensor networks have a collection of sensors.
    Set<Integer> adjacents; // Adjacency list of SensorNetwork objects.
}

class Sensor
{
    int networkID; // ID of the network to which this sensor belongs
    Map<Integer, Float> adjacents; // Adjacency list of sensors in form of <ID, distance>
}

The number of SensorNetworks is predefined (up to 1000), so I could use an array. But the number of Sensors is not fixed in advance (at most the number of sensor networks / 4).

Considering addition, deletion and get(), I need the option that is faster and takes less space (because I'm going to serialize these structures).

Here are my options (as far as I have thought)

Option 1: Don't define a class for SensorNetwork. Instead, use List<Set<Integer>> sensorNetwork; and another map Map<Integer, Set<Integer>> networkAdjacencies; for the adjacencies.
Option 2: Use Map<Integer, Set<Integer>> sensorNetwork. In case I want to get sensors of network i, I simply write sensorNetwork.get(i).
Option 3: Don't define either class. Use option 2 for SensorNetwork and, instead of the Sensor class:

Map<Integer, Map<Integer, Float>> sensorAdjacencies;
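To make option 3 concrete, here is a rough sketch of how that nested map could be used (the class and method names are just illustrative, not part of my actual code):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of option 3: one nested map instead of a Sensor class.
// Outer key: sensor ID; inner map: neighbour ID -> distance.
public class SensorAdjacency {
    private final Map<Integer, Map<Integer, Float>> sensorAdjacencies = new HashMap<>();

    // Record an undirected edge between two sensors.
    public void addEdge(int a, int b, float distance) {
        sensorAdjacencies.computeIfAbsent(a, k -> new HashMap<>()).put(b, distance);
        sensorAdjacencies.computeIfAbsent(b, k -> new HashMap<>()).put(a, distance);
    }

    // Returns the stored distance, or null if the sensors are not adjacent.
    public Float distanceBetween(int a, int b) {
        Map<Integer, Float> neighbours = sensorAdjacencies.get(a);
        return neighbours == null ? null : neighbours.get(b);
    }
}
```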

Which option should I choose in terms of space and time efficiency?


Solution

  • This sounds like it'd be very helpful for you (specifically the Data Structures section): http://bigocheatsheet.com/

    You say

    I need my structure to be efficient while adding, removing and finding elements. No other behavior.

    The problem is that Lists and Maps are usually used in totally different cases. Their names describe their use cases fairly well -- you use a List if you need to list something (probably in some sequential order), while a Map would be used if you need to map an input to an output. You can use a Map as a List by mapping Integers to your elements, but that's overcomplicating things a bit. However, even within List and Map you can have different implementations that differ wildly in asymptotic performance.
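    As a toy illustration of that difference (the sensor names here are made up): a List is addressed by position, while a Map is addressed by key, and the keys can be arbitrary, sparse IDs, which is closer to what sensor/network IDs look like.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ListVsMap {
    // A List is addressed by position: insertion order defines the index.
    static String secondSensor() {
        List<String> sensors = new ArrayList<>();
        sensors.add("thermometer"); // index 0
        sensors.add("barometer");   // index 1
        return sensors.get(1);
    }

    // A Map is addressed by key: the keys don't have to be dense or ordered.
    static String sensorWithId(int id) {
        Map<Integer, String> sensorsById = new HashMap<>();
        sensorsById.put(17, "thermometer");
        sensorsById.put(203, "barometer");
        return sensorsById.get(id); // null if the ID is unknown
    }
}
```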

    With few exceptions, data structures will take O(n) space, which makes sense. If memory serves, anything other than an ArrayList (or other collections backed only by a primitive array) will have a decent amount of space overhead as they use other objects (e.g. Nodes for LinkedLists and Entry objects for Maps) to organize the underlying structure. I wouldn't worry too much about this overhead though unless space really is at a premium.

    For best-performance addition, deletion, and search, you want to look at how the data structure is implemented.

    • A LinkedList-style implementation nets you O(1) addition and deletion (and with a good constant factor, too!), but has a pretty expensive get() at O(n) time, because the list has to be traversed every time you want to get something. Note that removing by value from Java's LinkedList is also O(n): the unlink itself is O(1), but only if you already hold a reference to the node being removed. Because you don't, remove(Object) spends O(n) searching for the node and then O(1) unlinking it.
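    A small sketch of that distinction: remove(Object) has to search from the head, while Iterator.remove() already holds the node and skips the search (the helper method names here are made up):

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class LinkedListRemoval {
    // remove(Object) must scan from the head: O(n) search, then O(1) unlink.
    static List<Integer> removeByValue(LinkedList<Integer> ids, int value) {
        ids.remove(Integer.valueOf(value));
        return ids;
    }

    // While iterating you already hold the node, so Iterator.remove()
    // skips the search and unlinks in O(1).
    static List<Integer> removeWhileIterating(LinkedList<Integer> ids, int value) {
        for (Iterator<Integer> it = ids.iterator(); it.hasNext(); ) {
            if (it.next() == value) it.remove();
        }
        return ids;
    }
}
```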

    • Data structures backed by a plain array have O(1) get() because it's a direct index into the array, but addition and deletion take O(n), because any addition/deletion other than at the last element requires all later elements to be shifted over (in Java's implementation, at least). Searching by object rather than by index is also O(n), because you have to iterate over the array to find the object.
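    A quick sketch of those costs using ArrayList (the helper names are just for illustration):

```java
import java.util.List;

public class ArrayBackedDemo {
    // get(i) is O(1): a direct index into the backing array.
    static String third(List<String> names) {
        return names.get(2);
    }

    // remove(0) is O(n): every later element is shifted one slot left.
    static List<String> dropFirst(List<String> names) {
        names.remove(0);
        return names;
    }

    // Searching by value is O(n): indexOf scans until it finds a match.
    static int positionOf(List<String> names, String name) {
        return names.indexOf(name);
    }
}
```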

    The following two structures are usually Maps, and so usually require you to implement equals() (and hashCode() for HashMaps):

    • Data structures backed by a balanced tree (e.g. TreeMap, which is a red-black tree) have O(lg n) add/remove, and this is a worst-case bound, not just amortized: a self-balancing implementation keeps worst-case additions/deletions to at most the height of the tree. get() operations are O(lg n) as well. Using a tree requires that your elements be sortable/comparable in some way, which could be a bonus or hindrance, depending on your usage.
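    A small TreeMap sketch (the keys here are Integers, which are already Comparable); note that keeping keys sorted also gives you range queries essentially for free:

```java
import java.util.Set;
import java.util.TreeMap;

public class TreeMapDemo {
    // TreeMap (a red-black tree) keeps keys sorted; keys must be Comparable,
    // or you must supply a Comparator. put/remove/get are O(log n).
    static TreeMap<Integer, String> build() {
        TreeMap<Integer, String> byId = new TreeMap<>();
        byId.put(42, "c");
        byId.put(7, "a");
        byId.put(19, "b");
        return byId; // keys iterate in sorted order: 7, 19, 42
    }

    // Sorted order buys range queries: all keys strictly below a bound.
    static Set<Integer> idsBelow(TreeMap<Integer, String> byId, int bound) {
        return byId.headMap(bound).keySet();
    }
}
```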

    • Hash-based data structures have amortized (average) O(1) everything, albeit with a slightly higher constant factor due to the overhead of hashing (and of following collision chains when the hash spread is poor). A HashMap can still start performing badly if you write a bad hashCode() function, so you want to be careful with that, although since Java 8 HashMap converts long collision chains into balanced trees behind the scenes to at least partially blunt the effect of bad hashCode() implementations.
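    To make the equals()/hashCode() contract concrete, here is a hedged sketch of a composite key class (SensorKey is a made-up name, not something from your code): equals() decides whether two keys are "the same entry", and hashCode() decides which bucket a key lands in, so the two must agree.

```java
import java.util.Objects;

// Illustrative key type: equals() and hashCode() must agree, or the map
// cannot find entries it already holds.
public class SensorKey {
    final int networkId;
    final int sensorId;

    SensorKey(int networkId, int sensorId) {
        this.networkId = networkId;
        this.sensorId = sensorId;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof SensorKey)) return false;
        SensorKey k = (SensorKey) o;
        return networkId == k.networkId && sensorId == k.sensorId;
    }

    // A reasonable hashCode spreads keys across buckets. Returning a
    // constant here would still be *correct*, but would degrade lookups
    // from O(1) toward O(n) as every key lands in one bucket.
    @Override public int hashCode() {
        return Objects.hash(networkId, sensorId);
    }
}
```

    With both methods in place, a lookup works with any key that is *equal* to the one stored, not just the identical object.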

    Hope that rundown helped. If you clear up how your program is structured, I might be able to give a recommendation. Until then, the best I can do is show you the options and let you pick.