Combining Points for a Graph in an Array?


I have collected data with a web scraper and want to make a line graph out of it. In my list of points ([[1,4],[2,3],[3,8]...]), some points share the same 'x' value but have different 'y' values. These should be combined into a single point whose 'y' is the average.

[[2,3],[5,2],[3,4],[5,4]...] ----------> [[2,3],[5,3],[3,4]...]

Is there a more efficient way to do that than a loop?


Solution

  • Some form of looping is unavoidable, but we can be Pythonic about it. Here's a solution I came up with:

    from itertools import groupby
    from operator import itemgetter
    from statistics import mean
    inp = [[2,3],[5,2],[3,4],[5,4]]
    
    points = [(x, mean(map(itemgetter(1), g))) for x, g in groupby(sorted(inp, key=itemgetter(0)), key=itemgetter(0))]
    print(points)  # [(2, 3), (3, 4), (5, 3)]
    

    We can break this list comprehension down into the following roughly equivalent code (the only difference is that it sorts inp in place):

    points = []
    inp.sort(key=itemgetter(0))                   # Sort by 'x' in place; groupby only groups consecutive items
    for x, g in groupby(inp, key=itemgetter(0)):  # Iterate over each run of points sharing the same 'x'
        y = map(itemgetter(1), g)                 # Lazily extract the 'y' values of this group (an iterator)
        avg_y = mean(y)                           # Take the arithmetic mean of those 'y' values
        points.append((x, avg_y))                 # Add this (x, average y) point to the result
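
If the sort is a concern, one possible alternative is to collect the 'y' values in a dict keyed by 'x'. This is only a sketch, not part of the solution above: average_by_x is a hypothetical helper name. It makes a single pass without sorting and keeps the 'x' values in first-seen order (relying on dicts preserving insertion order, Python 3.7+), which matches the order shown in the question.

```python
from collections import defaultdict
from statistics import mean


def average_by_x(points):
    """Average the y values of points sharing an x (hypothetical helper)."""
    ys_by_x = defaultdict(list)
    for x, y in points:
        ys_by_x[x].append(y)  # Collect every 'y' seen for this 'x'
    # One output point per distinct 'x', in first-seen order
    return [[x, mean(ys)] for x, ys in ys_by_x.items()]


inp = [[2, 3], [5, 2], [3, 4], [5, 4]]
print(average_by_x(inp))  # [[2, 3], [5, 3], [3, 4]]
```

For a line graph you will probably still want the points ordered by 'x'; wrapping the call as sorted(average_by_x(inp)) does that after the averaging.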