I have the following two arrays of equal length. My goal is to split array B into groups defined by array A, so that in the end there are 3 arrays, or a list of arrays. The final list of arrays should consist of the rows of array B, grouped by their value in A.
The order is not really relevant.
A = array([[-1],
           [ 1],
           [ 0],
           [ 0],
           [ 1]])
B = array([[ 624.5   ,  548.    ],
           [ 912.8201,  564.3444],
           [1564.5   ,  764.    ],
           [1463.4163,  785.9251],
           [1698.0757,  846.6306]])
I ran into this problem while using the DBSCAN clustering function. Array A holds the cluster labels (0, 1) of the points in array B. The value -1 marks a point as an outlier. (The values shown are not precise.) My goal is to calculate the compactness, ... of each cluster found.
The numpy_indexed package (disclaimer: I am its author) was designed with this type of use case in mind.
import numpy_indexed as npi
C = npi.group_by(A).split(B)
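If you would rather not add a dependency, the same kind of split can be sketched in plain NumPy. This is only a minimal sketch, not how numpy_indexed works internally; the names `labels`, `order`, `first` and `groups` are chosen here for illustration. It sorts the rows by label and cuts the sorted array wherever a new label starts:

```python
import numpy as np

A = np.array([[-1], [1], [0], [0], [1]])
B = np.array([[624.5, 548.0],
              [912.8201, 564.3444],
              [1564.5, 764.0],
              [1463.4163, 785.9251],
              [1698.0757, 846.6306]])

labels = A.ravel()                          # flatten (5, 1) to (5,)
order = np.argsort(labels, kind="stable")   # row indices sorted by label
# Position of the first row of each label within the sorted rows.
unique_labels, first = np.unique(labels[order], return_index=True)
# Cut the sorted rows at every label boundary (index 0 is not a cut point).
groups = np.split(B[order], first[1:])

for label, rows in zip(unique_labels, groups):
    print(label, rows.shape)
```

Here `groups` is a list with one array per distinct label, in ascending label order, so the outlier group (-1) comes first and can be dropped if it is not needed.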
I'm not sure what you mean by the compactness of each group, but rather than splitting and then doing further computations, it is typically more efficient to compute reductions over the groups directly. You can also reuse the grouping object for increased efficiency:
groups = npi.group_by(A)
# each reduction returns the unique keys together with the per-group result
unique, mean = groups.mean(B)
unique, std = groups.std(B)
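For comparison, here is a rough plain-NumPy sketch of per-cluster statistics that skips the outliers. It assumes that -1 is the noise label, as described in the question, and the `stats` dictionary is just an illustrative container. It loops over the labels with boolean masks, which is simple but slower than a grouped reduction when there are many clusters:

```python
import numpy as np

A = np.array([[-1], [1], [0], [0], [1]])
B = np.array([[624.5, 548.0],
              [912.8201, 564.3444],
              [1564.5, 764.0],
              [1463.4163, 785.9251],
              [1698.0757, 846.6306]])

labels = A.ravel()  # flatten (5, 1) to (5,) so it can be used as a row mask

stats = {}
for label in np.unique(labels):
    if label == -1:
        continue  # assumed noise label: not a real cluster
    members = B[labels == label]  # rows of B that belong to this cluster
    # Per-column mean and standard deviation (np.std defaults to ddof=0).
    stats[int(label)] = (members.mean(axis=0), members.std(axis=0))

for label, (mean, std) in stats.items():
    print(label, mean, std)
```

Whether the per-column standard deviation is a suitable measure of compactness depends on the definition you have in mind; the loop body is the place to swap in a different measure.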