python json dictionary networkx jsonlines

Is there a better way to get (key, itemN) tuples from a Python dictionary that has a list as its value?


I have a JSON Lines file in which each line is an item with a node as the key and, as the value, a list of the other nodes it is connected to. Adding the edges to a networkx graph requires, I think, tuples of the form (u, v). I wrote a naive solution for this, but I suspect it will be slow for large jsonl files. Does anyone have a better, more Pythonic solution to suggest?

dol = [{0: [1,2,3,4,5,6]},{1: [0,2,3,4,5,6]}]
for node in dol:
    #print(node)
    tpls = []
    key = list(node.keys())[0]
    tpls = [(key,v) for v in node[key]]
    print(tpls)

<iterate through each one in the list to add them to the graph>

[(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6)]
[(1, 0), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)]

Solution

    Only one key

    If each dict never has more than one item, you can do this:

    dol = [{0: [1, 2, 3, 4, 5, 6]}, {1: [0, 2, 3, 4, 5, 6]}]
    
    for node in dol:
        local_node = node.copy()  # only if dict shouldn't be modified in any way
        k, values = local_node.popitem()
        print([(k, value) for value in values])
    # [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6)]
    # [(1, 0), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)]
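
If copying each dict only to protect it from `popitem()` feels wasteful, one possible alternative is to unpack the single item directly from `items()`. The sketch below assumes every dict holds exactly one key; the helper name `single_key_edges` is made up for illustration and is not part of the original answer.

```python
dol = [{0: [1, 2, 3, 4, 5, 6]}, {1: [0, 2, 3, 4, 5, 6]}]


def single_key_edges(node):
    """Return (key, value) tuples for a dict expected to hold exactly one item."""
    # The trailing comma unpacks a one-element view; it raises ValueError
    # if the dict is empty or has more than one key.
    (key, values), = node.items()
    return [(key, value) for value in values]


edges_per_node = [single_key_edges(node) for node in dol]
print(edges_per_node[0])
# [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6)]
```

Nothing is removed from the original dicts here, so no copy is needed, and a dict with an unexpected number of keys fails loudly instead of being silently truncated.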
    

    Multiple keys

    But if a dict may contain more than one key, you can use a while loop that runs until the dict is empty:

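    # assumed input, inferred from the output below: the first dict has two keys
    dol = [{0: [1, 2, 3, 4, 5, 6], 2: [0, 2, 3, 4, 5, 6]},
           {1: [0, 2, 3, 4, 5, 6]}]

    for node in dol:
        local_node = node.copy()  # only if dict shouldn't be modified in any way
        while local_node:
            # popitem() removes the most recently inserted item first (Python 3.7+)
            k, values = local_node.popitem()
            print([(k, value) for value in values])
    # [(2, 0), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6)]
    # [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6)]
    # [(1, 0), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)]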
    

    Of course, if you need to store the generated list, append it to a list instead of just printing it.
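
As a rough sketch of storing the tuples rather than printing them, the snippet below reads JSON Lines input like the file described in the question and collects every edge in one list. The sample data and the `io.StringIO` stand-in for a real file are assumptions for illustration, as is the guess that node ids are integers.

```python
import io
import json

# Stand-in for an open .jsonl file; the contents are made up for illustration.
jsonl_file = io.StringIO('{"0": [1, 2, 3]}\n{"1": [0, 2, 3]}\n')

edges = []
for line in jsonl_file:
    line = line.strip()
    if not line:
        continue  # skip blank lines
    node = json.loads(line)
    for key, values in node.items():
        # JSON object keys are always strings, so convert back to int here
        # (an assumption: the node ids in the file are integers).
        edges.extend((int(key), value) for value in values)

print(edges)
# [(0, 1), (0, 2), (0, 3), (1, 0), (1, 2), (1, 3)]
```

A list built this way should be usable with networkx's `Graph.add_edges_from`, which accepts an iterable of (u, v) tuples.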

    Only one big dictionary

    If your dol object can be a single dictionary, it's even simpler. And if, as Yves Daoust said, you need an adjacency list or matrix, here are two examples:

    Adjacency list pure python

    An adjacency list:

    dol = {0: [1, 2, 3, 4, 5, 6],
           1: [0, 2, 3, 4, 5, 6]}
    
    adjacency_list = [(key, value) for key, values in dol.items() for value in values]
    print(adjacency_list)
    # [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (1, 0), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)]
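
If the data arrives as a list of small dicts, as in the question, one way to get a single dictionary is to merge them first. The following is only a sketch: the sample input (including a repeated key `0`) is invented to show how repeated keys would be combined.

```python
from collections import defaultdict

# Hypothetical input: a list of dicts in which key 0 appears twice.
dol = [{0: [1, 2, 3]}, {1: [0, 2, 3]}, {0: [4]}]

# Merge into one dict, concatenating the lists of any repeated key.
merged = defaultdict(list)
for node in dol:
    for key, values in node.items():
        merged[key].extend(values)

adjacency_list = [(key, value) for key, values in merged.items() for value in values]
print(adjacency_list)
# [(0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (1, 2), (1, 3)]
```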
    

    Adjacency matrix with pandas

    An adjacency matrix:

    import pandas
    dol = {0: [1, 2, 3, 4, 5, 6],
           1: [0, 2, 3, 4, 5, 6]}
    
    adjacency_list = [(key, value) for key, values in dol.items() for value in values]
    adjacency_df = pandas.DataFrame(adjacency_list)
    adjacency_matrix = pandas.crosstab(adjacency_df[0], adjacency_df[1],
                                       rownames=['keys'], colnames=['values'])
    print(adjacency_matrix)
    # values  0  1  2  3  4  5  6
    # keys                       
    # 0       0  1  1  1  1  1  1
    # 1       1  0  1  1  1  1  1
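
If pandas is not available, a similar table can be built in plain Python. This is a minimal sketch rather than a drop-in replacement for `pandas.crosstab`: it assumes the node ids are sortable and stores each row as a list keyed by its source node.

```python
dol = {0: [1, 2, 3, 4, 5, 6],
       1: [0, 2, 3, 4, 5, 6]}

# Columns cover every node that appears as a value, in sorted order.
columns = sorted({value for values in dol.values() for value in values})
col_index = {node: i for i, node in enumerate(columns)}

matrix = {}
for key, values in dol.items():
    row = [0] * len(columns)
    for value in values:
        row[col_index[value]] += 1  # count edges, as a cross-tabulation would
    matrix[key] = row

for key, row in matrix.items():
    print(key, row)
# 0 [0, 1, 1, 1, 1, 1, 1]
# 1 [1, 0, 1, 1, 1, 1, 1]
```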