I am currently writing my master's thesis. It's about graphs. My algorithm is ready. But now I have to think about useful data structures to represent the graph and the rest that I need for a good runtime. I am not allowed to use the adjacency matrix because of the large amount of memory. Since I have to check in every iteration whether a certain edge exists, adjacency lists also make no sense.
First I thought about two hash tables nested inside one another. All nodes are stored in the first table and all neighboring nodes in the second. But since I have to be able to choose a random neighbor in my algorithm, that is not optimal either. In addition, I have to be able to save edge weights in every iteration of the algorithm.
A list with all the basic operations:
I am allowed to use different data structures for all operations. Unfortunately, I'm running out of ideas.
I hope someone here can help me.
Thanks in advance!
Lisa
I would suggest the following:
A map data structure where the keys are implemented as a hashtable to give you O(1) access/insert/delete time. The keys of this map will be your nodes.
The values of this map data structure will be sets, also implemented as hashtables, to give you O(1) access/insert/delete time. The members of these sets are neighboring nodes to the key.
For the weights of the edges you have a few options. You can insert tuples into the sets of the the adjacency list where say the first value is the neighboring node and the second value is the weight of the edge connecting those two nodes. The weight can be updated in O(1) time since both the map and the sets are implemented as hashtables.
You can store the degrees of the nodes in a separate map data structure where the keys (implemented as hashtable) are the nodes and the values are the degrees. This will provide O(1) access/delete/update time to each node's degree.
To achieve O(1) time for random edges connected to a node you need to use an additional data structure. The reason is that the edges stored in the sets (as hashtables), won't give us a way to get a random number in O(1). But if we store each node's edges in a list/array like data structure, we can get a random index between 0 and the length of that array in O(1) and access that edge in O(1).
This method, will cover your time complexity requirements by adding to the space needed, although the space complexity does not increase.
Instead of separate data structures you could create classes that combine them in one place.
I can add simplified implementations if necessary.