Search code examples
algorithmcombinatoricsprobability-theory

Find repeated triads in arrays


I have a collection of arrays that include numbers from 1 to 10. The size of each array is 5. For example

  [1,2,3,4,5]
  [3,4,9,8,2]
  [1,5,7,9,2]
  [1,2,5,9,7]
  ...........
  [3,8,1,6,9]

What algorithm should I use to find repeated triads in these arrays? For example one of the results should be 1,2,5 since this triad is included in some of the arrays. I don't mind how many times some triad is repeated. I am looking the n most often (could be 3 or 4 or something else).

[1,2,3] is the same with [3,1,2] and each number is allowed only once. [3,3,4] is not valid.

This problem gets harder if we assume arrays of 10 or more numbers, so that each array could have a combination of triads. Just food for thought

 [1,3,5,19,45,11,12,13,9,31]

 [1,3,5,32,46,15,12,18,29,37]

 result : (1,3,5) (1,3,12) (3,5,12) etc

Solution

  • I have completely reviewed my response :

    • **Bugs fixed** in : `array function computeRepettition(array $a);`
      • **Avoid** increment of repetition if triad was already found in pass-1
      • **Return** an array of arrays, and the number of repetition of each triad is set in '**numberOfRepetition**', the triad in self is the key of the array
      • **Support** number composed of 2 digits or more
    • **New** `array function iCombination(array $a);` reduce the number of probability for finding triad, since the order it is not important, and repetition of number is not allowed
    • **Update** of `array function detectRepetition(array $a);` detects all triad that can be found
    <?php
    define ("MIN_LENGTH_VECTOR"     , 3  );
    define ("KEY_SEPERATOR"         , '-');
    
    $src2D = array (
    
         array(1,3,5,19,45,11,12,13,9, 100,31),   
         array(1,3,5,32,46,15,100, 12,18,29,37),
         array(1222,32222,5,3222222,4622222,1522222,100, 12,182222,292222,372222));
    var_dump (computeRepetition ($src2D));
    
    function computeRepetition ($src2D) {
        $repetition = array ();
        for ($i=0 ; $i<count($src2D)-1 ; $i++) {
    
            foreach ($repetition as &$rep) {
                $rep['escape'] = TRUE;
            }
    
            for ($j=$i+1 ; $j<count($src2D) ; $j++) {
                $t = buildTruth ($src2D[$i], $src2D[$j]);
                $r = detectRepetition ($t);
    
                if (is_null ($r)) continue;
    
                $comb = iCombination ($r);
    
                foreach ($comb as $cb) {
                    if (isset ($repetition[$cb]['escape']) &&  $repetition[$cb]['escape'] === TRUE) continue;
                    if (array_key_exists ($cb, $repetition)) {
                        $repetition[$cb]['numberOfRepetition']++;
                    } else {
                        $repetition[$cb]['numberOfRepetition'] = 2;
                        $repetition[$cb]['escape']             = FALSE;
                    }
                }
            }
        }
        return $repetition;
    }   
    
    function detectRepetition ($t) {
        $a = array ();
        foreach ($t as $key => $value) {
            if ($value === TRUE) {
                $a[] = $key;
            }
        }
        if (count($a) < MIN_LENGTH_VECTOR) return NULL; 
        return $a;
    }
    
    function iCombination ($array) {
        $res = array ();
        sort ($array, SORT_NUMERIC);
    
        for ($i = 0 ; $i < count ($array) - 2 ; $i++) {
            for ($k = $i + 1 ; $k < count ($array) - 1 ; $k++) {
                for ($l = $k + 1 ; $l < count ($array) ; $l++) {
                    $res[] = $array[$i] . KEY_SEPERATOR . $array[$k] . KEY_SEPERATOR . $array[$l]; 
                }
            }           
        }
        return $res;
    }
    
    
    function buildTruth ($vec1, $vec2) {
        foreach ($vec1 as $v) {
            $t[$v] = FALSE;
        }
        foreach ($vec2 as $v) {
            if (isset ($t[$v]) && $t[$v] === FALSE ) $t[$v] = TRUE ;
        }
        return $t;
    }