I have an interesting python "network" challenge. Not sure what data construct to use and am hoping to learn something new from the experts!
The following is a snippet from an ordered table (see SORT
column) with a FROM NODE
and TO NODE
. This table represents a hydrologic network (watershed sequencing) from upstream to downstream. TO NODE = 0
indicates the bottom of the network.
Given there can be many branches, how does one traverse a network from any given FROM NODE
to TO NODE = 0
and store the list of nodes that are sequenced over?
For example, starting at:
HYBAS_ID (this is same as FROM NODE)= 2120523430
see <----
about 2/3rd way down, the sequence of Nodes would include the following:
[2120523430, 2120523020, 2120523290, 2120523280,2120523270,2120020790, 0]
Table:
SORT FROM NODE TO NODE
30534 2121173490 2120522070
30533 2120521930 2120521630
30532 2120522070 2120521630
30531 2120521620 2120522930
30530 2120521630 2120522930
30529 2121172200 2121173080
30528 2120522930 2120523360
30527 2120522940 2120523360
30526 2120519380 2120523170
30525 2120519520 2120523170
30524 2120523440 2120523350
30523 2120523360 2120523350
30522 2120525750 2120523430
30521 2120525490 2120523430
30520 2121173080 2120522820
30519 2120523430 2120523020 <------- if you start network here
30518 2120523350 2120523020
30517 2120522820 2120523290
30516 2120523020 2120523290 <------ the second node is here
30515 2120523170 2120523280
30514 2120523290 2120523280 <------ third node here
30513 2120523160 2120523270
30512 2120523280 2120523270 <------ fourth
30511 2120523150 2120020790
30510 2120523270 2120020790 <------ fifth
30509 2120020790 0 <------ End.
Would a dictionary and some kind of graph structure work to traverse this network? The code may be fed a list of thousands or millions of FROM NODES
to calculate the routing so efficiency is important.
The standard way to do this is with an unordered dictionary representing a directed graph, and then to iterate through it. First you have to populate the dictionary (which let's say is a csv for the sake of argument), and then iterate through the values. so
my_graph = {}
with open("myfile.csv") as f:
for l in f:
split_lines = l.split(",")
my_graph[int(split_lines[1])] = int(split_lines[2])
HYBAS_ID = 2120523430
current_hybas = my_graph[HYBAS_ID]
return_value = [HYBAS_ID]
while current_hybas != 0:
return_value.append(current_hybas)
current_hybas = my_graph[current_hybas]
print return_value
Should approximately get you what you want.