Search code examples
c++algorithmmatrixgreedy

How to solve this matrix problem to collect all the seeds?


Problem description:

Hammy the hamster found a building with seeds. Help him collect them all.

The building is arranged in a matrix of rooms, with seeds in some rooms. Once in the building, Hammy will only run in a straight line, all the way through the building and collects all the seeds on that line. The building has many entrances and Hammy can enter at any row or column as he wishes.

He wants to collect all the seeds in the building with as few runs as possible.

Here is a sample layout of the building:

+---+---+---+---+---+
| * |   | * |   |   |
+---+---+---+---+---+
| * | * |   |   |   |
+---+---+---+---+---+
| * |   |   |   | * |
+---+---+---+---+---+
|   |   |   | * |   |
+---+---+---+---+---+
|   |   |   | * |   |
+---+---+---+---+---+

This building has 25 rooms (5X5) but only 8 rooms have seeds (rooms containing '*').

We need to find the minimum runs to collect all the seeds.

My approach:

I tried to solve it with some Greedy Approach, here it is:

1) Start with row/column that contains maximum rooms of seeds (For example: column 1 for here).

2) Updating rows/columns (no of rooms that contain seeds).

3) Repeating steps 1 & 2 until all the seeds are collected.

So, for this example when I apply my approach,

It starts with Column 1 (it contains 3 rooms),

after updating, next will be Column 4,

Here is the building situation right now,

+--+---+---+--+---+
|  |   | * |  |   |
+--+---+---+--+---+
|  | * |   |  |   |
+--+---+---+--+---+
|  |   |   |  | * |
+--+---+---+--+---+
|  |   |   |  |   |
+--+---+---+--+---+
|  |   |   |  |   |
+--+---+---+--+---+

Now three rooms contain seeds and all on different rows or columns, I needs to take all of them, as a result I finished with 5 runs, which is the maximum (taking all rows or all columns is always an answer withing thinking about the seeds at all). Without any calculation anyone can reach here. Runs containing all rows or columns.

But this problem can be solved in 4 runs.

I'm looking for an approach/algorithm to find the minimum runs (which rows and/or columns I need to run).


Solution

  • The matrix can form a bipartite graph, all rows can form the vertices on left and all columns on the right. If there are seeds in a cell, it will establish an edge between left & right. So, the given example will form the following bipartite graph.

    enter image description here

    Now all we need to select a set of vertices so that at least one end point of every edge will be included on that vertex set. In graph theory this is known as vertex cover problem. https://en.wikipedia.org/wiki/Vertex_cover

    To achieve minimum vertex cover we need to find the maximum matching of the graph. Hopcroft–Karp algorithm provides us to find a maximum matching in polynomial time (in worst case O(n^2.5). Once we have a maximum matching, Kőnig's theorem helps us to find the minimum vertex cover in O(n^2) time.

    How it works:

    Say M is the maximum matching, initially it is set to empty. From the bipartite graph, we need to find maximal set of vertex-disjoint shortest augmenting paths P.

    here P = {r1c1, r2c2, r3c5, r4c4}

    M = M (symmetric difference) P = {r1c1, r2c2, r3c5, r4c4}

    in this example there are no more augmenting path remaining. So the maximum matching is achieved.

    Now using Kőnig's theorem we need to find the minimum vertex cover from the matching M. Here the vertex cover will be {r1, r2, r3, c4}

    So the minimum run is 4.
    

    Overall complexity of this process is O(E * (V^0.5)), E & v are the length of set of edges and vertices. For a sparse matrix it will run in near-linear time.

    Another example:

    +---+---+---+---+---+
    |   | * | * |   | * |
    +---+---+---+---+---+
    |   |   |   | * |   |
    +---+---+---+---+---+
    | * | * |   | * |   |
    +---+---+---+---+---+
    |   |   |   | * |   |
    +---+---+---+---+---+
    

    Bipartite graph:

    enter image description here

    Maximum matching M = {r1c2, r2c4, r3c1}

    Minimum vertex cover = {r1, r3, c4}

    So the minimum run is 3.
    

    If you are only interested about the minimum run (number only, no need to show details) then you don't need to implement Kőnig's theorem, because the size of maximum matching is the length of minimum vertex cover set.