
Issue with python list for complex types


Below is a Python snippet that stores IP prefixes in a radix tree and then, whenever an IP belongs to a prefix, records the IP and its ASNs in that prefix's dictionary.

I would like to find all the distinct ASNs for a particular prefix. More details are provided below:

import radix

# rtree is a radix tree that stores the prefixes.
rtree = radix.Radix()
with open(path-to-prefix-file, 'r') as fp:
    for line in fp:
        rnode = rtree.add(line.strip())  # e.g. prefixes like "192.168.2.0/24"; strip the trailing newline
        rnode.data["count"] = 0
...        
# The code has several lines here for processing a capnproto - skipping them.

rnode.data[IP]=asn_complete  # I read a Capnproto buffer and store IP and asn_complete

...

for rnode in rtree:
    seen_list = []  # collects the distinct val (i.e., asn_complete) values
    if rnode.data["count"] > 1:
        # Iterate through the rnode.data dictionary
        for ip, val in rnode.data.iteritems():
            if val not in seen_list:  # Condition is always satisfied!!
                seen_list.append(val)

For example, val has the following value, read from the Cap'n Proto buffer, in several iterations:

[<capnp list reader [15169]>, <capnp list reader [1239]>, <capnp list reader [4837]>]

When I print out the seen_list:

[[<capnp list reader [15169]>, <capnp list reader [1239]>, <capnp list reader [4837]>], [<capnp list reader [15169]>, <capnp list reader [1239]>, <capnp list reader [4837]>], [<capnp list reader [15169]>, <capnp list reader [1239]>, <capnp list reader [4837]>],....]

Clearly val is already in seen_list, yet the test "if val not in seen_list:" is always true, so val gets appended to seen_list many times. I don't understand why the condition always evaluates to true. Is it because of the type of object stored in seen_list?


Solution

  • At present, Cap'n Proto readers do not support any sort of "equality" comparison. Partly this is because it's unclear what equality should mean: should it be by identity (two readers are equal if they point at exactly the same object) or should it be by value (they are equal if they point at objects with equivalent content)?

    In any case, the "in" operator relies on an implementation of __eq__ to test for equality, and Cap'n Proto readers provide no such implementation. What probably ends up happening is that Python compares the wrapper objects by identity, and because new wrapper objects keep being created, these comparisons are always false.

    In order to get what you want, you'll probably need to convert the Cap'n Proto objects fully into plain Python objects which are properly comparable.
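
As a minimal sketch of that conversion, the snippet below copies each value into a tuple of tuples of plain integers, which compares by value and is hashable, so a set can track what has already been seen. FakeListReader, to_plain and distinct_values are hypothetical names introduced only for this illustration. FakeListReader stands in for a Cap'n Proto list reader so the example runs without the capnp library, and the sketch assumes the real readers can be iterated to yield plain integers.

```python
class FakeListReader(object):
    """Hypothetical stand-in for a Cap'n Proto list reader.

    It is iterable but defines no __eq__, so two instances with the same
    content compare unequal (Python falls back to identity).
    """

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


def to_plain(val):
    """Copy a sequence of list readers into a tuple of tuples of plain values."""
    return tuple(tuple(inner) for inner in val)


def distinct_values(values):
    """Return the distinct values, in first-seen order, compared by content."""
    seen = set()
    ordered = []
    for val in values:
        key = to_plain(val)  # plain tuples compare by value and are hashable
        if key not in seen:
            seen.add(key)
            ordered.append(key)
    return ordered


# Assumed sample data, modelled on the values shown in the question.
readings = [
    [FakeListReader([15169]), FakeListReader([1239]), FakeListReader([4837])],
    [FakeListReader([15169]), FakeListReader([1239]), FakeListReader([4837])],
    [FakeListReader([3356])],
]

# The raw wrappers never match, which reproduces the behaviour in the question.
print(readings[0] in [readings[1]])

# After conversion, duplicates are detected by content.
print(distinct_values(readings))
```

In the loop from the question, this would amount to testing to_plain(val) against the seen values instead of val itself. Entries of rnode.data that are not reader sequences, such as the "count" key, would need to be skipped first.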