I have two lists: one is a depth list and the other is a chlorophyll list, and the two correspond to each other element by element. I want to average the chlorophyll data over every 0.5 m of depth.
chl = [0.4,0.1,0.04,0.05,0.4,0.2,0.6,0.09,0.23,0.43,0.65,0.22,0.12,0.2,0.33]
depth = [0.1,0.3,0.31,0.44,0.49,1.1,1.145,1.33,1.49,1.53,1.67,1.79,1.87,2.1,2.3]
The depth bins are not always equal in length and do not always start at 0.0 or fall on 0.5 intervals. Each chlorophyll value always corresponds to the depth value at the same index, though. The chlorophyll averages also must not be sorted in ascending order; they need to stay in the correct order according to depth. The depth and chlorophyll lists are very long, so I can't do this by hand.
How would I make 0.5 m depth bins with averaged chlorophyll data in them?
Goal:
depth = [0.5,1.0,1.5,2.0,2.5]
chlorophyll = [avg1,avg2,avg3,avg4,avg5]
For example:
avg1 = np.mean([0.4,0.1,0.04,0.05,0.4])
One way is to use numpy.digitize to assign each depth to a bin, then use a dictionary or list comprehension to calculate the mean for each bin.
import numpy as np
chl = np.array([0.4,0.1,0.04,0.05,0.4,0.2,0.6,0.09,0.23,0.43,0.65,0.22,0.12,0.2,0.33])
depth = np.array([0.1,0.3,0.31,0.44,0.49,1.1,1.145,1.33,1.49,1.53,1.67,1.79,1.87,2.1,2.3])
bins = np.array([0,0.5,1.0,1.5,2.0,2.5])
A = np.vstack((np.digitize(depth, bins), chl)).T
res = {bins[int(i)]: np.mean(A[A[:, 0] == i, 1]) for i in np.unique(A[:, 0])}
# {0.5: 0.198, 1.5: 0.28, 2.0: 0.355, 2.5: 0.265}
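To see why the dictionary keys come out as the upper edge of each bin, it may help to look at what numpy.digitize returns on its own. The snippet below is only an illustration: the depth values are made up for the example and are not your data.

```python
import numpy as np

# Illustrative depths only (not the data from the question).
depth = np.array([0.1, 0.49, 0.5, 1.1, 2.3])
bins = np.array([0, 0.5, 1.0, 1.5, 2.0, 2.5])

# With the default right=False, digitize returns i such that
# bins[i-1] <= x < bins[i], so indices start at 1 for values >= bins[0].
idx = np.digitize(depth, bins)

# Indexing bins with that result therefore gives the upper edge of each bin.
# Note: a depth >= bins[-1] would give idx == len(bins) and raise IndexError here.
upper_edges = bins[idx]

print(idx)          # [1 1 2 3 5]
print(upper_edges)  # [0.5 0.5 1.  1.5 2.5]
```

A depth that falls exactly on an edge, such as 0.5, goes into the bin that starts at that edge (0.5 to 1.0), not the one that ends there.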
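Or, as a list with one entry per bin index, which is closer to the format you are after:
res_lst = [np.mean(A[A[:, 0] == i, 1]) for i in range(len(bins))]
# [nan, 0.198, nan, 0.28, 0.355, 0.265]
# Index 0 covers depths below bins[0] and index 2 is the empty 0.5-1.0 bin, so both are nan (NumPy warns about the empty slices).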
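Since the lists are very long, a vectorized variant that avoids the per-bin comprehension might be worth considering. The following is a minimal sketch, not the only way to do it: it swaps the comprehension for numpy.bincount, builds the bin edges from the data instead of hard-coding them, and assumes all depths are non-negative. The names width, n_bins, edges, avg and bin_depth are introduced here for illustration.

```python
import numpy as np

chl = np.array([0.4, 0.1, 0.04, 0.05, 0.4, 0.2, 0.6, 0.09, 0.23, 0.43, 0.65, 0.22, 0.12, 0.2, 0.33])
depth = np.array([0.1, 0.3, 0.31, 0.44, 0.49, 1.1, 1.145, 1.33, 1.49, 1.53, 1.67, 1.79, 1.87, 2.1, 2.3])

width = 0.5  # bin size in metres

# Enough bins to cover the deepest sample (assumes depth >= 0 everywhere).
n_bins = int(np.floor(depth.max() / width)) + 1
edges = np.arange(n_bins + 1) * width  # [0, 0.5, 1.0, ...]

# 0-based bin index for every sample: edges[i] <= depth < edges[i + 1].
idx = np.digitize(depth, edges) - 1

# Per-bin sum of chlorophyll and per-bin sample count.
sums = np.bincount(idx, weights=chl, minlength=n_bins)
counts = np.bincount(idx, minlength=n_bins)

# Mean per bin; bins with no samples stay nan instead of dividing by zero.
avg = np.full(n_bins, np.nan)
np.divide(sums, counts, out=avg, where=counts > 0)

bin_depth = edges[1:]  # upper edge of each bin, in depth order

print(bin_depth)
print(avg)
```

For the sample data this gives five bins labelled 0.5 through 2.5, in depth order, with nan for the empty 0.5 to 1.0 bin. If you would rather drop empty bins than keep nan, you could filter both arrays with counts > 0.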