php string position max distance

Calculate the maximum distance (position) between pairs of two strings in a file


I want to calculate the maximum distance between two strings in a file.

$str1 and $str2 could be the same, as in max_dist('.', '.', $text).

Let's say the signature is: public static function max_dist($str1, $str2, $text)

The text would be something like a Wikipedia article, for example:

Though not explicitly anarchist, they organized by rank and file democracy, embodying a spirit of resistance that has inspired many Anglophone syndicalists.

I want to find the maximum distance between ',' and '.' in the whole file. Note that ',' and '.' are repeated throughout the text and may or may not occur in pairs.

The example should return 125 as the distance between the first comma and the dot.

So far I have written the following code, but it only works if the strings to find are not repeated:

  public static function max_dist($str1, $str2, $f)
  {
    $len = strlen($f);
    $max_dist = 0;
    $pos_a = 0; $pos_b = 1;
    while (true) {
      $a = strpos($f, $str1, $pos_a);
      $b = strpos($f, $str2, $pos_b);

      if (!($a && $b)) break;

      if ($a == $b)
        $b = strpos($f, $str2, $a + 1);

      if (!$a || !$b) continue;     

      //if() 
      $abs = abs($a - $b);
      if ($abs > $max_dist) $max_dist = $abs;

      $pos_a = $a + 1;
      $pos_b = $b + 1;
    }

    return $max_dist;
  }

Any ideas? I found the strstr function, but it does not have an offset option.


Edit to clarify:

Let's say the text is like:

The House of Plantagenet (1154–1485) was the royal house of all the English kings from Henry II to Richard III, including the Angevin kings and the houses of Lancaster and York. In addition to the traditional judicial, feudal and military roles of the king, the Plantagenets had duties to the realm that were underpinned by a sophisticated justice system.

Your function (@bogdan) returns the maximum over all occurrences, that is, 246. I also want a maximum, but computed pair by pair; in this case it should be 139 (from "judicial," to "system.").


Solution

  • I believe your function is too complicated. Just use strpos with strrpos:

    $string = "Though not explicitly anarchist, they organized by rank and file democracy, embodying a spirit of resistance that has inspired many Anglophone syndicalists.";
    
    function max_dist($needle1, $needle2, $string)
    {
        $first = strpos($string, $needle1);
        $last = strrpos($string, $needle2);
    
        return $last - $first;
    }
    
    echo max_dist(',', '.', $string);
    

    You might need to swap the parameters and check a second time if . can come before ,. Or use the abs function.
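
    A minimal sketch of the "check both orders" idea above, assuming either needle may come first in the text. The name max_span is hypothetical, and returning null when a needle is missing is an assumption, not part of the original function:

    ```php
    <?php
    // Hypothetical helper (not from the original answer): widest span between
    // any occurrence of $needle1 and any occurrence of $needle2, in either order.
    // Returns null when one of the needles does not occur (an assumption).
    function max_span(string $needle1, string $needle2, string $text): ?int
    {
        $first1 = strpos($text, $needle1);
        $first2 = strpos($text, $needle2);
        if ($first1 === false || $first2 === false) {
            return null;
        }
        $last1 = strrpos($text, $needle1);
        $last2 = strrpos($text, $needle2);

        // Order "needle1 ... needle2" versus order "needle2 ... needle1";
        // at least one of the two differences is non-negative.
        return max($last2 - $first1, $last1 - $first2);
    }

    // Here the only '.' comes before every ',', so the swapped order wins.
    echo max_span(',', '.', 'x. a, b, c'), PHP_EOL; // prints 6
    ```

    Comparing against false with === matters here, because a match at position 0 is a valid result.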

    Update

    If I understand correctly, then you should keep only the part of the string that has not been processed yet:

    $string = "The House of Plantagenet (1154–1485) was the royal house of all the English kings from Henry II to Richard III, including the Angevin kings and the houses of Lancaster and York. In addition to the traditional judicial, feudal and military roles of the king, the Plantagenets had duties to the realm that were underpinned by a sophisticated justice system.";
    
    function max_dist($needle1, $needle2, $string)
    {
        // Collect the distance of every pair found while scanning left to right.
        // Note: assumes $needle1 and $needle2 differ; identical needles always
        // match at the same position, so the loop would never advance.
        $distances = [];
        $string_left = $string;
        while (($first = strpos($string_left, $needle1)) !== false
            && ($last = strpos($string_left, $needle2)) !== false) {
            $distances[] = abs($last - $first);
            // Drop everything before the later of the two matches.
            $string_left = substr($string_left, max($first, $last));
        }
        return $distances;
    }
    
    echo max(max_dist(',', '.', $string));
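
    A similar pairwise scan can also be sketched with the offset parameter of strpos instead of cutting the string. This is only a sketch under assumptions: the name max_pair_dist is hypothetical, and a "pair" is taken to mean an occurrence of $needle1 followed by the next occurrence of $needle2 after it, which is one reading of the question:

    ```php
    <?php
    // Hypothetical helper (not from the original answer). Returns 0 when no
    // pair is found (an assumption).
    function max_pair_dist(string $needle1, string $needle2, string $text): int
    {
        $max = 0;
        $offset = 0;
        $len = strlen($text);

        while ($offset < $len) {
            // Opening needle of the next pair.
            $start = strpos($text, $needle1, $offset);
            if ($start === false) {
                break;
            }
            // Search strictly after $start so identical needles still pair up.
            $end = strpos($text, $needle2, $start + 1);
            if ($end === false) {
                break;
            }
            $max = max($max, $end - $start);
            // Resume at the closing needle: with identical needles it also
            // opens the next pair, and $end > $start guarantees progress.
            $offset = $end;
        }

        return $max;
    }

    echo max_pair_dist(',', '.', 'a, bb. c, dddd, e.'), PHP_EOL; // prints 9
    echo max_pair_dist('.', '.', 'a.bb.cccc.'), PHP_EOL;         // prints 5
    ```

    Because the scan works on byte offsets in the original string, multibyte characters such as the en dash in the sample text count as several positions, exactly as with strpos elsewhere in this answer.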