
Replace repeating strings in a string


I'm trying to find (and replace) repeated strings within a string.

My string can look like this:

Lorem ipsum dolor sit amet sit amet sit amet sit nostrud exercitation amit sit ullamco laboris nisi ut aliquip ex ea commodo consequat.

This should become:

Lorem ipsum dolor sit amet sit nostrud exercitation amit sit ullamco laboris nisi ut aliquip ex ea commodo consequat.

Note how the amit sit isn't removed, since it's not repeated.

Or the string can be like this:

Lorem ipsum dolor sit amet () sit amet () sit amet () sit nostrud exercitation ullamco laboris nisi ut aliquip aliquip ex ea commodo consequat.

which should become:

Lorem ipsum dolor sit amet () sit nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

So it's not just a-z; the repeated part can also contain other (ASCII) characters. I'd be very happy if someone could help me with this.

The next step would be to match (and replace) something like this:

2 questions 3 questions 4 questions 5 questions

which would become:

2 questions

The number in the final output can be any of them (2, 3, 4, ...); it doesn't matter. In this last example only the numbers differ; the words are always the same.


Solution

  • First task solution code:

    <?php
    
        function split_repeating($string)
        {
            $words = explode(' ', $string);
            $words_count = count($words);
    
            $need_remove = array();
            for ($i = 0; $i < $words_count; $i++) {
                $need_remove[$i] = false;
            }
    
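            // Try every repeat length $i (in words), longest first, and check each
            // start position $j where two adjacent blocks of $i words still fit
            // inside the array (this avoids reading past the last word).
            for ($i = intdiv($words_count, 2); $i >= 1; $i--) {
                for ($j = 0; $j + 2 * $i <= $words_count; $j++) {
                    // Skip start positions that are already marked for removal.
                    $need_remove_item = !$need_remove[$j];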
                    for ($k = $j; $k < ($j + $i); $k++) {
                        if ($words[$k] != $words[$k + $i]) {
                            $need_remove_item = false;
                            break;
                        }
                    }
                    if ($need_remove_item) {
                        for ($k = $j; $k < ($j + $i); $k++) {
                            $need_remove[$k] = true;
                        }
                    }
                }
            }
    
            $result_string = '';
            for ($i = 0; $i < $words_count; $i++) {
                if (!$need_remove[$i]) {
                    $result_string .= ' ' . $words[$i];
                }
            }
            return trim($result_string);
        }
    
    
    
        $string = 'Lorem ipsum dolor sit amet sit amet sit amet sit nostrud exercitation amit sit ullamco laboris nisi ut aliquip ex ea commodo consequat.';
    
        echo $string . '<br>';
        echo split_repeating($string) . '<br>';
        echo 'Lorem ipsum dolor sit amet sit nostrud exercitation amit sit ullamco laboris nisi ut aliquip ex ea commodo consequat.' . '<br>' . '<br>';
    
    
    
        $string = 'Lorem ipsum dolor sit amet () sit amet () sit amet () sit nostrud exercitation ullamco laboris nisi ut aliquip aliquip ex ea commodo consequat.';
    
        echo $string . '<br>';
        echo split_repeating($string) . '<br>';
        echo 'Lorem ipsum dolor sit amet () sit nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.';
    
    ?>
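
Since the question is tagged regex, the same collapse can also be sketched with a single backreference pattern instead of nested loops. This is an alternative illustration, not part of the original answer: the function name `collapse_repeats_regex` is made up here, and the sketch assumes that words are separated by single spaces.

```php
<?php

// Hypothetical helper (not from the original answer): collapses a run of
// identical, space-separated word groups into a single copy of the group.
function collapse_repeats_regex(string $string): string
{
    // (?<![^ ])          start at the beginning of a word
    // (\S+(?: \S+)*?)    shortest group of one or more whole words
    // (?: \1)+           followed by one or more copies of that group
    // (?![^ ])           ending at the end of a word
    $pattern = '/(?<![^ ])(\S+(?: \S+)*?)(?: \1)+(?![^ ])/';

    // preg_replace() returns null on failure (e.g. backtrack limit reached);
    // fall back to the unchanged input in that case.
    return preg_replace($pattern, '$1', $string) ?? $string;
}

echo collapse_repeats_regex('dolor sit amet sit amet sit amet sit nostrud amit sit ullamco'), "\n";
// prints: dolor sit amet sit nostrud amit sit ullamco
```

Because the group is matched lazily, the shortest repeating unit wins, so `sit amet sit amet sit amet` collapses to one `sit amet`. On long inputs the pattern may backtrack heavily, so the loop version above is probably the safer choice there.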
    

  • Second task solution code:

    <?php
    
        function split_repeating($string)
        {
            $words = explode(' ', $string);
            $words_count = count($words);
    
            $need_remove = array();
            for ($i = 0; $i < $words_count; $i++) {
                $need_remove[$i] = false;
            }
    
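            // Treat $words[$j] as the varying token (the number) and check that
            // every second word after it, starting at $words[$j + 1], is the same.
            for ($j = 0; $j < ($words_count - 1); $j++) {
                $need_remove_item = !$need_remove[$j];
                for ($k = $j + 1; $k < ($words_count - 1); $k += 2) {
                    // A missing partner word ($k + 2 past the end) counts as a mismatch.
                    if (!isset($words[$k + 2]) || $words[$k] != $words[$k + 2]) {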
                        $need_remove_item = false;
                        break;
                    }
                }
                if ($need_remove_item) {
                    for ($k = $j + 2; $k < $words_count; $k++) {
                        $need_remove[$k] = true;
                    }
                }
            }
    
            $result_string = '';
            for ($i = 0; $i < $words_count; $i++) {
                if (!$need_remove[$i]) {
                    $result_string .= ' ' . $words[$i];
                }
            }
            return trim($result_string);
        }
    
    
    
        $string = '2 questions 3 questions 4 questions 5 questions';
    
        echo $string . '<br>';
        echo split_repeating($string) . '<br>';
        echo '2 questions';
    
    ?>
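
The numbered variant can be sketched with a regex as well. Again, this is only an alternative illustration under stated assumptions: the helper name `collapse_numbered_repeats` is hypothetical, the varying part is taken to be a run of digits, and the repeated part is taken to be a single word without spaces.

```php
<?php

// Hypothetical helper (not from the original answer): keeps the first
// "<number> <word>" pair and drops the following pairs that repeat the word.
function collapse_numbered_repeats(string $string): string
{
    // (?<![^ ])          start at the beginning of a word
    // (\d+) (\S+)        first number and the word after it
    // (?: \d+ \2)+       one or more further "<number> <same word>" pairs
    // (?![^ ])           ending at the end of a word
    $pattern = '/(?<![^ ])(\d+) (\S+)(?: \d+ \2)+(?![^ ])/';

    // Fall back to the unchanged input if preg_replace() fails and returns null.
    return preg_replace($pattern, '$1 $2', $string) ?? $string;
}

echo collapse_numbered_repeats('2 questions 3 questions 4 questions 5 questions'), "\n";
// prints: 2 questions
```

Unlike the loop above, this sketch does not require the pairs to run to the end of the string, so text before or after the repeated pairs is left in place.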