python, apache-spark, pyspark, rdd

PySpark - reduceByKey on a (tuple, int) value


I have an RDD whose records look like:

(key, ((tuple), int))

I want to reduce it by key and get the average of each position in the tuple. For example, with this small sample:

[(1, ((0, 19, 15, 39), 1)), (1, ((0, 64, 19, 3), 1))]

I will get:

[(1, ((0, 83, 34, 42), 2))]

and then (or directly):

[(1, (0, 41.5, 17, 21))]

I tried:

reduceByKey(lambda a,b: a+b)
reduceByKey(lambda a,b: (a[0]+b[0],a[1]+b[1]))

and other variations that didn't help or raised RDD errors.

How can I solve the issue?


Solution

  • You need some further calculation to get the average per key: first sum the value tuples element-wise along with the counts, then divide each summed element by the count:

    # sum the value tuples element-wise and add the counts, then divide each element by the count
    result = rdd.reduceByKey(lambda a, b: (tuple(i + j for i, j in zip(a[0], b[0])), a[1] + b[1])) \
                .map(lambda r: (r[0], tuple(i / r[1][1] for i in r[1][0])))
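
    As a cross-check, here is a minimal self-contained sketch that runs the same pipeline on the sample data from the question. It assumes a local Spark installation; the context creation and variable names (sc, rdd) are illustrative, not part of the original answer:

    from pyspark import SparkContext

    sc = SparkContext("local", "tuple-average")   # assumed local setup

    # sample RDD from the question: (key, ((values), count))
    rdd = sc.parallelize([
        (1, ((0, 19, 15, 39), 1)),
        (1, ((0, 64, 19, 3), 1)),
    ])

    result = rdd.reduceByKey(lambda a, b: (tuple(i + j for i, j in zip(a[0], b[0])), a[1] + b[1])) \
                .map(lambda r: (r[0], tuple(i / r[1][1] for i in r[1][0])))

    print(result.collect())   # [(1, (0.0, 41.5, 17.0, 21.0))]

    Note that i / r[1][1] uses true division on Python 3; on Python 2 you would need float(i) / r[1][1] (or from __future__ import division) to avoid the averages being truncated to integers.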