Search code examples
algorithmgraphtime-complexitynetworkx

Which is the best existing algortihm for finding the node(s), in a network graph, with maximum degree?


Problem:

Given a graph, find the node (s) with the maximum degree and return it/them in a list.

My solution:

def max_degree(graph):
        """
        :param graph: non-null and non-oriented networkx type graph to analyze.
        :return: list of nodes of this graph with maximum degree; 0 if the graph has no nodes; -1 if the graph is disconnected;
        """
        if len(graph.nodes) == 0:
            return 0
        if not nx.is_connected(graph):
            return -1
        # node_grade_list -> [(<node>, <grade>), ...]
        node_grade_list = sorted(grafo.degree, key=lambda x: x[1], reverse=True)
        max_grade = node_grade_list[0][1]
        max_grade_nodes_list = []
        for node in node_grade_list:
            if node[1] == max_grade:
                max_grade_nodes_list.append(node[0])
        return max_grade_nodes_list

Is there a way to make it much more efficient?


Solution

  • Assuming that graph.degree is a list of tuples

    [(<node>, <grade>), ...]
    

    Replace getting max nodes with

    max_grade = max(graph.degree, key=lambda x: x[1])[1]
    max_grade_nodes_list = [node[0] for node in graph.degree if node[1] == max_grade]
    

    This is faster since this is O(n), while sorted requires O(n*log(n))

    Revised code

    def max_degree(graph):
        """
        :param graph: non-null and non-oriented networkx type graph to analyze.
        :return: list of nodes of this graph with maximum degree; 0 if the graph has no nodes; -1 if the graph is disconnected;
        """
        if len(graph.nodes) == 0:
            return 0
        if not nx.is_connected(graph):
            return -1
        # graph.degree -> [(<node>, <grade>), ...]
        max_grade = max(graph.degree, key=lambda x: x[1])[1]
    
        return [node[0] for node in graph.degree if node[1] == max_grade]
    

    Explanation of Why Max is Faster

    Time complexity is a concept in computer science that deals with the quantification of the amount of time taken by a set of code or algorithm to process or run as a function of the amount of input.

    Both max and sorted are builtin functions so both are natively coded.

    Function max is linear in time O(n), since we only have to traverse the list once to find a max. This means as we double the size n of a list, the time it takes to find the max doubles.

    Python sort uses TimSort which has an average time complexity of O(n*log(n)), where n is the length of the input list.

    O(n*log(n)) average complexity is typical for many sorting algorithms such as quicksort, mergesort, etc.

    Comparing O(n*log(n)) to O(n) we see that Max has a speed advantage of:

    O(n*log(n))/log(n)  = O(log(n)).
    

    Meaning for a list of 1024 elements, n = 2^10 it will be log(2^10) = 10 times faster

    For smaller list, for instance n = 10 it won't really matter which method you use.