
Recursive merge sort - sorts coordinates in ascending order from an origin


I have an assignment where I need to implement merge sort to sort some coordinates in ascending order from the origin. I have my program's coordinates as follows (cannot edit this; it is required for the assignment):

// stores the origin (does not change).
struct origin
{
  int x_naught;
  int y_naught;
};

// Individual coordinate to be sorted. 
struct coordinate
{
  int x;
  int y;
};

I have an unsorted array of pointers to struct coordinate, declared as struct coordinate **coordArray;, that holds n pointers.

My question is, what is the most efficient way to sort these coordinates in ascending order from the origin using merge sort? I originally planned to have a helper function that declares a separate array and stores the magnitude (distance formula) of each coordinate in coordArray, and then pass that array, along with its length, to a recursive merge sort function.

This, I believe, is the easiest and most straightforward way to do this. However, I do not like the fact that this helper function takes an extra n (length) steps of work before merge_sort is even called. Is it possible to somehow sort these coordinates by just passing the two struct pointers into a recursive merge_sort function?

Possible function signature:

void merge_sort(struct origin *origin_coord, struct coordinate **coordArray, int length);

One problem I see with the approach I am trying to implement here is how to deal with sorting the x and y components separately (i.e., coordArray[i]->x, coordArray[i]->y) relative to the x and y in struct origin (i.e., origin_coord->x_naught, origin_coord->y_naught).

I understand that O(2n log(n)) (or maybe even O(3nlogn) to convert the sorted magnitude array back into its original form: (x, y)) for the merge_sort and helper function really isn't that big of a deal, but if I'm sorting 1 billion coordinates, that extra 2 billion steps of work might matter.


Solution

  • If you cannot store the distance or its square in the coordinate structure, I would suggest you pass the origin structure as an argument to your mergesort_coords function and recompute the squares of both distances in the comparison function. Comparing squared distances gives the same ordering as comparing distances, so no square root is needed. This computation is cheap and simple compared to the overhead of allocating and handling ancillary structures. Furthermore, the time complexity is unchanged at O(N log N).

    Regarding the x and y ordering, you should specify how to order coordinates that are at the same distance from the origin. There are many options for this, but the comparison function must be transitive and the order should be total (every pair of coordinates must be comparable).

    Suggested prototypes:

    void merge_sort(struct coordinate **coordArray, size_t count,
                    int cmp(const struct coordinate *a,
                            const struct coordinate *b,
                            const struct origin *origin_coord),
                    const struct origin *origin_coord);
    
    int cmpdistance(const struct coordinate *a,
                    const struct coordinate *b,
                    const struct origin *o)
    {
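        // widen before subtracting so the differences cannot overflow int;
        // the sums of squares fit in long long while each difference is below 2^31 in magnitude
        long long dxa = (long long)a->x - o->x_naught;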
        long long dya = (long long)a->y - o->y_naught;
        long long dxb = (long long)b->x - o->x_naught;
        long long dyb = (long long)b->y - o->y_naught;
        long long da = dxa * dxa + dya * dya;
        long long db = dxb * dxb + dyb * dyb;
        if (da < db) return -1;
        if (da > db) return +1;
        // if points are at the same distance, order by x, then by y
        if (a->x < b->x) return -1;
        if (a->x > b->x) return +1;
        if (a->y < b->y) return -1;
        if (a->y > b->y) return +1;
        return 0;
    }
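
The prototype above leaves the sort itself unwritten. The following is a minimal sketch of one way to fill it in, not a definitive implementation. It makes several assumptions that are not in the original answer: the function is named mergesort_coords and returns an int status instead of void, the coord_cmp typedef and the helpers merge_runs and sort_range are hypothetical names introduced here, and a single scratch array is allocated once and reused at every recursion level.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct origin { int x_naught; int y_naught; };
struct coordinate { int x; int y; };

/* Hypothetical typedef for the three-argument comparison function. */
typedef int coord_cmp(const struct coordinate *a,
                      const struct coordinate *b,
                      const struct origin *o);

/* Same comparison as in the answer: squared distance, then x, then y. */
int cmpdistance(const struct coordinate *a,
                const struct coordinate *b,
                const struct origin *o)
{
    long long dxa = (long long)a->x - o->x_naught;
    long long dya = (long long)a->y - o->y_naught;
    long long dxb = (long long)b->x - o->x_naught;
    long long dyb = (long long)b->y - o->y_naught;
    long long da = dxa * dxa + dya * dya;
    long long db = dxb * dxb + dyb * dyb;
    if (da != db) return da < db ? -1 : +1;
    if (a->x != b->x) return a->x < b->x ? -1 : +1;
    if (a->y != b->y) return a->y < b->y ? -1 : +1;
    return 0;
}

/* Merge the sorted runs [lo, mid) and [mid, hi) of arr, using tmp as scratch. */
static void merge_runs(struct coordinate **arr, struct coordinate **tmp,
                       size_t lo, size_t mid, size_t hi,
                       coord_cmp *cmp, const struct origin *o)
{
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        /* <= takes from the left run on ties, which keeps the sort stable. */
        if (cmp(arr[i], arr[j], o) <= 0)
            tmp[k++] = arr[i++];
        else
            tmp[k++] = arr[j++];
    }
    while (i < mid) tmp[k++] = arr[i++];
    while (j < hi)  tmp[k++] = arr[j++];
    memcpy(arr + lo, tmp + lo, (hi - lo) * sizeof *arr);
}

/* Recursively sort the half-open range [lo, hi). */
static void sort_range(struct coordinate **arr, struct coordinate **tmp,
                       size_t lo, size_t hi,
                       coord_cmp *cmp, const struct origin *o)
{
    if (hi - lo < 2)
        return;                      /* 0 or 1 element: already sorted */
    size_t mid = lo + (hi - lo) / 2;
    sort_range(arr, tmp, lo, mid, cmp, o);
    sort_range(arr, tmp, mid, hi, cmp, o);
    merge_runs(arr, tmp, lo, mid, hi, cmp, o);
}

/* Sorts the pointers in place. Returns 0 on success, -1 if the scratch
   array cannot be allocated (the input is then left untouched). */
int mergesort_coords(struct coordinate **coordArray, size_t count,
                     coord_cmp *cmp, const struct origin *origin_coord)
{
    if (count < 2)
        return 0;
    assert(coordArray != NULL && cmp != NULL && origin_coord != NULL);
    struct coordinate **tmp = malloc(count * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    sort_range(coordArray, tmp, 0, count, cmp, origin_coord);
    free(tmp);
    return 0;
}
```

A call would look like mergesort_coords(coordArray, n, cmpdistance, &origin). Only the pointers are moved, so the coordinate structures themselves are never copied, and no separate array of magnitudes has to be built or mapped back to (x, y) pairs. The price is that the squared distances are recomputed on every comparison, which is the trade-off the answer argues is acceptable.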