Tags: python, function, numpy, knn, nearest-neighbor

Function that reads an np.array and returns the mean of the k nearest neighbors of a number p


I need to define a function which reads a numpy array and produces the mean of the k points nearest to a number p in the array.

Example:

array = np.array([1, 2, 3, 4, 5, 6, 7, 50, 24, 32, 9, 11, 12, 10])
p = 15 (note: this is not a number in the array; I need to find the
numbers closest to p, or p itself if it is present)
k = 3

In this case, I would need to generate the mean of [11, 12, 10],
as they are closest to p = 15.

In general, I need to find the mean of the k points closest to p, where p may or may not appear explicitly in the array.

I am new and very confused at this point and feel I have exhausted my resources. I feel this question has been asked before but the answers are much too complex for what I need.

Thanks in advance.


Solution

  • Given a 1-D array arr and a scalar input p, here's how you could find the mean of the n nearest values (n plays the role of k in the question):

    import numpy as np

    def neighbor_mean(arr, p, n=3):
        # indices of the n smallest absolute differences from p
        idx = np.abs(arr - p).argsort()[:n]
        return arr[idx].mean()
    
    arr = np.array([1, 2, 3, 4, 5, 6, 7, 50, 24, 32, 9, 11, 12, 10])
    neighbor_mean(arr, p=15)
    # 11.0
    

    In the above, first you take the absolute differences:

    np.abs(arr - 15)
    # array([14, 13, 12, 11, 10,  9,  8, 35,  9, 17,  6,  4,  3,  5])
    

    Then argsort() returns the indices that would sort that array of differences. Taking the first n of them gives the positions of the n smallest absolute differences. You want these positions, rather than the sorted differences themselves, because you need them to look up the original values in arr.

    np.abs(arr - p).argsort()[:3]
    # array([12, 11, 13])
    

    Lastly you want to index your input array arr and take the mean of this:

    arr[[12, 11, 13]]
    # array([12, 11, 10])  # mean: 11.0
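
If the array is large, a full sort does more work than needed. As a minimal sketch of one alternative (not part of the answer above), np.argpartition can pick out the n smallest differences without ordering the rest. The function name neighbor_mean_partition and the choice to clamp n to the array length are assumptions made for illustration; the sketch also assumes a non-empty 1-D array.

```python
import numpy as np

def neighbor_mean_partition(arr, p, n=3):
    """Mean of the n values in arr closest to p, without a full sort."""
    arr = np.asarray(arr)
    # Assumption: if n exceeds the array length, use every element.
    n = min(n, arr.size)
    dist = np.abs(arr - p)
    # argpartition puts the indices of the n smallest distances in the
    # first n slots, in no particular order (order does not affect the mean).
    idx = np.argpartition(dist, n - 1)[:n]
    return arr[idx].mean()

arr = np.array([1, 2, 3, 4, 5, 6, 7, 50, 24, 32, 9, 11, 12, 10])
result = neighbor_mean_partition(arr, p=15)
# 11.0
```

Both versions break ties in distance arbitrarily, so if two values are equally far from p and only one fits within n, which one is used is not guaranteed.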