Question is, how to clusterize pairs of some units by their angle? Problem is that, kmeans operates on the notion of Euclidean space distance and does not know about periodic nature of angles. So to make it work, one needs to translate the angle to Euclidean space but hold the following true:
Which means, that, 90 and -90 are distant values, 180 and -180 is the same, 170 and -170 are close (angles come from left up and to right: 0 - +180 and from left down to the right: 0 - -180)
I tried to use various sin()
functions but they all have issues mentioned in points 1 and 2. Most perspective one is sin(x * 0.5f)
but also having the problem that 180 and -180 are distant values in Euclidean space.
The solution I found is to translate angles to points on circle and feed them into kmeans. This way we make it to compare distances between points and this works perfectly.
Important thing to mention. Kmeans @eps
in termination criterion is expressed in terms of units of samples that you feed to kmeans. In our example maximal distant points have dist 200 units (2 * radius). This means that having 1.0f is totally fine. If you use cv::normalize(samples, samples, 0.0f, 1.0f)
for your samples before calling kmeans()
, adjust your @eps
appropriately. Something like eps=0.01f
plays better here.
Enjoy! Hope this helps someone.
static cv::Point2f angleToPointOnCircle(float angle, float radius, cv::Point2f origin /* center */)
{
float x = radius * cosf(angle * M_PI / 180.0f) + origin.x;
float y = radius * sinf(angle * M_PI / 180.0f) + origin.y;
return cv::Point2f(x, y);
}
static std::vector<std::pair<size_t, int> > biggestKmeansGroup(const std::vector<int> &labels, int count)
{
std::vector<std::pair<size_t, int> > indices;
std::map<int, size_t> l2cm;
for (int i = 0; i < labels.size(); ++i)
l2cm[labels[i]]++;
std::vector<std::pair<size_t, int> > c2lm;
for (std::map<int, size_t>::iterator it = l2cm.begin(); it != l2cm.end(); it++)
c2lm.push_back(std::make_pair(it->second, it->first)); // count, group
std::sort(c2lm.begin(), c2lm.end(), cmp_pair_first_reverse);
for (int i = 0; i < c2lm.size() && count-- > 0; i++)
indices.push_back(c2lm[i]);
return indices;
}
static void sortByAngle(std::vector<boost::shared_ptr<Pair> > &group,
std::vector<boost::shared_ptr<Pair> > &result)
{
std::vector<int> labels;
cv::Mat samples;
/* Radius is not so important here. */
for (int i = 0; i < group.size(); i++)
samples.push_back(angleToPointOnCircle(group[i]->angle, 100, cv::Point2f(0, 0)));
/* 90 degrees per group. May be less if you need it. */
static int PAIR_MAX_FINE_GROUPS = 4;
int groupNr = std::max(std::min((int)group.size(), PAIR_MAX_FINE_GROUPS), 1);
assert(group.size() >= groupNr);
cv::kmeans(samples.reshape(1, (int)group.size()), groupNr, labels,
cvTermCriteria(CV_TERMCRIT_EPS/* | CV_TERMCRIT_ITER*/, 30, 1.0f),
100, cv::KMEANS_RANDOM_CENTERS);
std::vector<std::pair<size_t, int> > biggest = biggestKmeansGroup(labels, groupNr);
for (int g = 0; g < biggest.size(); g++) {
for (int i = 0; i < group.size(); i++) {
if (labels[i] == biggest[g].second)
result.push_back(group[i]);
}
}
}