Search code examples
arraystraversalpath-finding

All non-overlapping groups


I'm trying to extract all possible non-overlapping groups from a nested array to produce the least number of groups

{
  0:  4, 5, 6
  1:  4, 5
  2:  3, 4 ,9
  3:  8, 9
  4:  8,10,11
  5:  8,10,11
}

Moving from 0->5, here are the best possible outcomes:

  1. [ [4,4,4],[8,8,8] ]
  2. [ [5,5],[9,9],[10,10] ]
  3. [ [5,5],[9,9],[11,11] ]

1 is the better group, because it only has two sets.

I know how to do this in several loops, but is there any way I can do this in a single pass? It feels like it might be a pathfinding problem, but not sure how to convert it into one.


Solution

I devised a method based on libik's answer below (using JS) in only 19 cycles!

function bestPath(){
    var array = [[4,5,6],[4,5],[3,4,9],[8,9],[8,10,11],[8,10,11]],
        arrayOfIndexes = (function (){
            var arr = [];
            for (var k=0; k < array.length; k++){
                arr.push({
                    index: array[k].length-1,
                    last_split: 0
                })
            }
            //dummy index, needed for jumping back
            arr.push({index:0, last_split:0});  

            return arr;
        })();


    // This function clones the current working path
    // up to a specific index (which we then use to 
    // rebuild upon by taking another route)
    var cloneTill = function(array, till){
        var new_arr = [];
        for (var l=0; l < till; l++)
            new_arr.push( array[l] );
        return new_arr;
    };

    var numset = 0,             // running counter of num. sets in working path
        bestset = 99999;

    var path_list = [],       // array of all paths
        best_path = null,     //
        temppath = [];        // current working path

    var row = 0,
        min_stretch_len = 2; // minimum length heuristic

    while (true){
        var color_index = arrayOfIndexes[row];
        console.log("path="+temppath);

var jumpBack = false;

        if (row === 0){
            if (color_index.index < 0) break;                   //No more paths to explore after jumping back.
            numset = 0;
        }

        if (row === array.length){

            if (numset < bestset){
                bestset = numset;
                best_path = temppath;
            }
            path_list.push ( temppath );

            jumpBack = true;
        }

        if (color_index.index < 0) jumpBack = true;


        if (jumpBack){
//          console.log( ">>jumprow:"+row+"-->"+color_index.last_split
//                        +", index to:"+(arrayOfIndexes[color_index.last_split].index - 1)+'\n');

            // jump back to last split  
            row = color_index.last_split;
            temppath = cloneTill(temppath, row);

            arrayOfIndexes[row].index--;        
            continue;
        }

        //We have an unexplored color
        var color = array[row][color_index.index];
//          console.log("    trying color='"+color+"'");


        //Perform lookahead
        var stretch = row;
        while ( stretch < array.length  && array[stretch].indexOf(color)!== -1){
            temppath.push(color);
            stretch ++;
        }
        stretch -= row;

        // Unsuccessful
        if (stretch < min_stretch_len){
//          console.log("    failed (too short)");
            while(stretch --> 0) temppath.pop();    // clear changes

            arrayOfIndexes[row].index--;            // next attempt at this row will try a different index          
            continue;
        }

        // Successfully found a new color. Splitting
//      console.log("    worked, arrayOfIndexes["+(row+stretch)+"].last_split = "+row);
        arrayOfIndexes[row+stretch].last_split = row; // this row is where we split

        row += stretch;
        numset ++;
    }   
    console.log("sols", path_list);
    console.log("best path=", best_path);
}

Solution

  • I think it is not possible in O(n) time (see the comment).

    However you can save some time by solving this with backtracking, remembering the best solution and "cut" the solutions which you know that cant reach the solution better than the best found.


    This is backtracking solution with cutting

    import java.util.LinkedList;
    
    public class JavaApplication12 {
    
        public static void main(String[] args) {
            int[][] array = {{4, 5, 6}, {4, 5}, {3, 4, 9}, {8, 9}, {8, 10, 11}, {8, 10, 11}};
            int[] arrayOfIndexes = new int[array.length];
    
            LinkedList<Integer> solution = new LinkedList<>();
            boolean end = false;
    
            int valueOfSolution = 1;
            int bestValueOfSolution = Integer.MAX_VALUE;
            LinkedList<Integer> bestSolution = new LinkedList<>();
    
            int row = 1;        
    
            while (end == false) {
                if (row == array.length) {
                    if (bestValueOfSolution > valueOfSolution) {
                        bestValueOfSolution = valueOfSolution;
                        bestSolution = (LinkedList<Integer>) solution.clone();
                    }
                    row++;
                } else {
                    if (row > array.length) {
                        row = array.length - 1;
                    }
    
                    if (arrayOfIndexes[0] == array[0].length) {
                        end = true;
                    } else if (array[row].length == arrayOfIndexes[row] || solution.size() > row || valueOfSolution >= bestValueOfSolution ) {
                        if (valueOfSolution >= bestValueOfSolution && !(array[row].length == arrayOfIndexes[row] || solution.size() > row)){
                            System.out.println("Cutting");
                        }
    
                        boolean decreaseRow = true;
                        if (solution.size() > row) {
                            decreaseRow = false;
                        }
    
                        int lastNumber = solution.removeLast();
    
                        if (solution.isEmpty() || solution.getLast().equals(lastNumber) == false) {
                            valueOfSolution--;
                        }
    
                        if (decreaseRow) {
                            arrayOfIndexes[row] = -0;
                            row--;
                        }
                    } else {
                        if (!solution.isEmpty() && array[row][arrayOfIndexes[row]] != solution.getLast()) {
                            valueOfSolution++;
                        }
    
                        if (solution.isEmpty()){
                            valueOfSolution = 1;
                        }
    
                        solution.add(array[row][arrayOfIndexes[row]]);
                        arrayOfIndexes[row]++;
                        row++;
                    }
                }
            }
    
            System.out.println("Best solution is: " + bestSolution);
            System.out.println("It has value of: " + bestValueOfSolution);
        }
    }
    

    The output of this example is

    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Cutting
    Best solution is: [4, 4, 4, 8, 8, 8]
    It has value of: 2
    

    I added a "numberOfSteps" variable to count how many times the whyle cycle is called.

    With Cutting I got

    Number of steps:62
    

    Without cutting!!

    Number of steps:1314
    

    The cutting is provided by this condition valueOfSolution >= bestValueOfSolution. We are looking for the smallest number, right? So as we are constructing a solution by adding numbers from each row, if we already have same or bigger score, we cant go better with it, no matter what numbers we add, so we can skip this.


    If you do not know what backtracking is, this is a nice gif of how it is done for sudoku : http://upload.wikimedia.org/wikipedia/commons/thumb/8/8c/Sudoku_solved_by_bactracking.gif/220px-Sudoku_solved_by_bactracking.gif

    It is just trying to add numbers and when it cannot go forward (cause adding any number would break the rules of sudoku) it returns and tries another numbers.