Search code examples
algorithmmedian

Understanding Cormen post office location solution


The book "Introduction to algorithms" by Cormen has a question post office location problem in chap 9.

We are given n points p1,p2,...pn with weights w1,w2,....wn. Find a point p(not necessarily one of the input points) that minimizes the sum wi*d(pi,p) where d(a,b) = distance between points a,b.

Looking at the solution to the same , I understand that the weighed median would be the best solution for this problem.

But I have some fundamental doubts about the actual coding part and the usage.

  • If all elements have equal weight , then to find the weighed median, we find the point till which summation of all weights < 1/2. How to extend it here ?

  • Given a real scenario having say the number of letters to be delivered at various houses as the weights and we want to minimize the distance to be traveled by finding the location of the post office, x coordinates given ( assuming all houses are in 1 single dimension) , how would we actually go about it ?

Could someone help me in clearing my doubts and understanding the problem.

EDIT :

I was also thinking about a very similar problem : There is a rectangular grid(2d) and different number of people at various places and all want to meet at 1 point (should definitely have integer coordinates) , then what difference would be there from the above problem and how would we solve it ?


Solution

  • You still want the point at which the weights sum to 1/2. Pick any point and consider whether you would do better moving one point to the left or one point to the right from there. If you move left one point you reduce the distance to all points on the left by one and increase the distance to all points on the right by one. Whether you win or lose by this depends on the sum of the weights to the left and the sum of the weights to the right. If the weights do not sum to 1/2 you can do better by moving in the direction that has weight > 1/2, so the only point where you can't do better by choosing another one is the point with weight 1/2 - or, to be more accurate, the point where the weights on either side are both <= 1/2.

    For 1/2 to be the right answer the weights have to sum to 1, so if you start off with weights which are numbers of letters then you have to divide them by the total number of letters to get them to sum by one. Of course this penalty function doesn't really make sense unless you have to make a separate trip for each letter to be delivered, but I'm assuming that we are supposed to ignore that.

    EDIT

    For more than one dimension, you pretty much end up solving the problem of minimising the weighted sum of distances directly. Wikipedia describes this in http://en.wikipedia.org/wiki/Geometric_median. You want to take weights into account, but that doesn't complicate the problem that much. One way of doing it is http://en.wikipedia.org/wiki/Iteratively_reweighted_least_squares. Unfortunately, that doesn't guarantee that the solution it finds will be on a grid point, or that the nearest grid point to a solution will be the best grid point. It probably won't be too bad, but finding the very best grid point in all possible cases might be trickier.