php regex permutation regex-look-ahead

Regex to collect batches of substring permutations of increasing lengths


I have a string, e.g. HRJSHR, and I'm looking for a way to find all possible permutations (that is, all contiguous substrings) of A-Z letters with a length of 2 or more. For example:

  • HR, RJ, JS, SH, HR
  • HRJ, RJS, JSH, SHR
  • HRJS, RJSH, JSHR
  • HRJSH, RJSHR
  • HRJSHR

|[A-Z]{2,}| just returns the whole string "HRJSHR", and |[A-Z]{2}| returns only the ones with a length of 2 letters. |[A-Z]{2+}| doesn't work.

Which regular expression will find all permutations of A-Z with a length of 2 or more letters in the string?


Solution

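  • You could capture inside a lookahead: (?=([A-Z]{2})) matches at every position that is followed by two [A-Z] letters and captures them in group 1. Because the lookahead itself consumes no characters, the captures can overlap, giving HR, RJ, JS, SH, HR. See test at regex101.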
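
    To illustrate why the lookahead matters, here is a small sketch (the variable names $plain and $overlap are only for illustration). A plain pattern consumes the letters it matches, so the next match can only start after them; a zero-width lookahead consumes nothing, so the engine tries every position and group 1 collects the overlapping substrings:

    ```php
    <?php
    $str = "HRJSHR";

    // Consuming pattern: each match uses up two letters, so matches cannot overlap.
    preg_match_all('/[A-Z]{2}/', $str, $plain);
    print_r($plain[0]);   // HR, JS, HR

    // Lookahead: the overall match is empty (zero-width);
    // capture group 1 holds the two letters that follow each position.
    preg_match_all('/(?=([A-Z]{2}))/', $str, $overlap);
    print_r($overlap[1]); // HR, RJ, JS, SH, HR
    ```

    Note that $overlap[0] contains only empty strings, which is why the result is read from index 1.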

    Combine it with a loop that increases the length until nothing matches any more (preg_match_all then returns 0, which ends the loop):

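    $str = "HRJSHR";
    $res = array();

    // Increase the length $i until preg_match_all() finds no match (returns 0).
    for ($i = 2; preg_match_all('/(?=([A-Z]{' . $i . '}))/', $str, $out); $i++) {
        $res[$i] = $out[1]; // group 1 holds the substrings of length $i
    }

    print_r($res);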
    

    See test at eval.in; it outputs:

    Array
    (
        [2] => Array
            (
                [0] => HR
                [1] => RJ
                [2] => JS
                [3] => SH
                [4] => HR
            )
    
        [3] => Array
            (
                [0] => HRJ
                [1] => RJS
                [2] => JSH
                [3] => SHR
            )
    
        [4] => Array
            (
                [0] => HRJS
                [1] => RJSH
                [2] => JSHR
            )
    
        [5] => Array
            (
                [0] => HRJSH
                [1] => RJSHR
            )
    
        [6] => Array
            (
                [0] => HRJSHR
            )
    
    )
    

    For a result without grouping by length, use:
    $res = array_merge($res, $out[1]); instead of $res[$i] = $out[1];
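
    Both result shapes can also be wrapped in one function. This is only a minimal sketch: substrings_by_length and its parameters $min and $grouped are hypothetical names that do not appear in the original answer, and PHP 7 or later is assumed for the type declarations:

    ```php
    <?php
    // Hypothetical helper (not part of the original answer): collects all
    // runs of A-Z with at least $min letters, optionally grouped by length.
    function substrings_by_length(string $str, int $min = 2, bool $grouped = true): array
    {
        $res = array();
        // preg_match_all() returns the number of matches; once $i exceeds the
        // longest run of letters it returns 0 and the loop stops.
        for ($i = $min; preg_match_all('/(?=([A-Z]{' . $i . '}))/', $str, $out); $i++) {
            if ($grouped) {
                $res[$i] = $out[1];                 // keyed by substring length
            } else {
                $res = array_merge($res, $out[1]);  // one flat list
            }
        }
        return $res;
    }

    print_r(substrings_by_length("HRJSHR"));           // grouped by length
    print_r(substrings_by_length("HRJSHR", 2, false)); // flat list
    ```

    For "HRJSHR" the flat variant should hold 15 entries, from HR up to HRJSHR, in order of increasing length.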