I have collected data from a web scraper and want to make a line graph out of it. In my list of points ([[1,4],[2,3],[3,8]...]), some points share the same 'x' value but have different 'y' values. These should be combined into a single point whose 'y' is the average.
[[2,3],[5,2],[3,4],[5,4]...] ----------> [[2,3],[5,3],[3,4]...]
Is there a more efficient way to do that than a loop?
You could simply loop through these, but we can be more Pythonic about it. Here's a solution I came up with:
from itertools import groupby
from operator import itemgetter
from statistics import mean
inp = [[2,3],[5,2],[3,4],[5,4]]
points = [(x, mean(map(itemgetter(1), g))) for x, g in groupby(sorted(inp, key=itemgetter(0)), key=itemgetter(0))]
print(points) # [(2, 3), (3, 4), (5, 3)]
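The sorted() call matters because groupby only groups consecutive items with equal keys. A small sketch using the same sample input shows the difference (the names unsorted_keys and sorted_keys are just for illustration):

```python
from itertools import groupby
from operator import itemgetter

inp = [[2, 3], [5, 2], [3, 4], [5, 4]]

# Without sorting, the two x == 5 points are not adjacent,
# so groupby yields two separate groups for x == 5.
unsorted_keys = [x for x, _ in groupby(inp, key=itemgetter(0))]

# After sorting by x, each x value forms exactly one group.
sorted_keys = [x for x, _ in groupby(sorted(inp, key=itemgetter(0)), key=itemgetter(0))]

print(unsorted_keys)  # [2, 5, 3, 5]
print(sorted_keys)    # [2, 3, 5]
```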
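We can break this list comprehension down into the following roughly equivalent code (the one difference is that inp.sort() sorts the list in place, whereas sorted() returns a new list):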
points = []
inp.sort(key=itemgetter(0)) # Sort results by 'x' value (for groupby)
for x, g in groupby(inp, key=itemgetter(0)): # Iterate through each group of points sharing an 'x'
    y = map(itemgetter(1), g) # Extract the 'y' values (a lazy iterator, not a list)
    avg_y = mean(y) # Get the statistical mean of all the 'y' values
    points.append((x, avg_y)) # Add this x,y-coordinate to the result set
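If you would rather skip the sort and keep the points in the order their 'x' values first appear (as in the example in the question), a dictionary-based variant is one option. This is only a sketch: average_by_x is a hypothetical helper name, and it relies on dicts preserving insertion order (Python 3.7+).

```python
from collections import defaultdict
from statistics import mean

def average_by_x(points):
    """Hypothetical helper: average the y values of points that share an x."""
    ys_by_x = defaultdict(list)
    for x, y in points:
        ys_by_x[x].append(y)  # Collect every y seen for this x
    # One output point per distinct x, in first-seen order
    return [(x, mean(ys)) for x, ys in ys_by_x.items()]

inp = [[2, 3], [5, 2], [3, 4], [5, 4]]
result = average_by_x(inp)
print(result)  # [(2, 3), (5, 3), (3, 4)]
```

This makes a single pass over the input instead of sorting it first, at the cost of still using an explicit loop.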