php regex random replace preg-replace-callback

Replace whole words in a string with a random word from an array of grouped words

$myVar = 'essa pizza é muito gostosa';

$myWords=array(
    array('sabor','gosto','delicia'),
    array('saborosa','gostosa','deliciosa'),
);

foreach($myWords as $words){
    // randomize the subarray
    shuffle($words);
    // pipe-together the words and return just one match
    if(preg_match('/\b\K'.implode('|',$words).'\b/',$myVar,$out)){
        // generate "replace_pair" from matched word and a random remaining subarray word
        // replace and preserve the new sentence
        $myVar=strtr($myVar,[$out[0]=>current(array_diff($words,$out))]);
    }
}
echo $myVar;

The result should be one of:

$myVar = 'essa pizza é muito deliciosa';

$myVar = 'essa pizza é muito saborosa';

But the code matches the shorter words instead, simply because the longer word begins with all the letters of the shorter one!

The actual output is wrong:

$myVar = 'essa pizza é muito saborsa';

"saborsa" (this word exists neither in Portuguese nor in my arrays)!

The code cuts "gosto" out of the word "gostosa" and puts the word "sabor" in its place. So instead of inserting the word "saborosa", it forms a word that does not exist: "sabor" + the leftover "sa" from "gostosa" = "saborsa". It has to be "saborosa".

The big problem is that part of the word "gostosa" is treated as the word "gosto".

How can I make it match the full key/word before doing the replacement? Thanks

Solution

I came across my original answer years after the fact and realized that it was not built to make multiple random replacements. I have scrapped the old answer and replaced it with a more robust technique.

Code: (Demo)

$myVar = 'essa pizza é muito gostosa, gostosa, gostosa, gosto';

$myWords = [
    ['sabor', 'gosto', 'delicia'],
    ['saborosa', 'gostosa', 'deliciosa'],
];

$grouped = [];
$flipped = [];
foreach ($myWords as $row) {
    $grouped[] = '(' . implode('|', $row) . ')';
    $flipped[] = array_flip($row);
}
$pattern = '/\b(?:' . implode('|', $grouped) . ')\b/';

var_export(
    preg_replace_callback(
        $pattern,
        function($m) use ($flipped) {
            array_shift($m);
            foreach ($m as $i => $captured) {
                if ($captured) {
                    unset($flipped[$i][$captured]);
                    return array_rand($flipped[$i]);
                }
            }
        },
        $myVar
    )
);

Potential Output:

'essa pizza é muito deliciosa, saborosa, deliciosa, sabor'

Data Preparation:

Form an array of pipe-delimited capture groups, one for each set of words ($grouped) -- these strings will comprise the central portion of the regex pattern.
Form an array where the subarray values become the respective subarray keys ($flipped) -- this will make accessing the random replacement words simpler/cleaner.
Form a regex pattern which glues together the pipe-delimited, parenthetically-wrapped strings with more pipes, then wrap that string in a non-capturing group, then wrap that in word boundaries so that only whole words are matched. The generated pattern for the sample data is:
```
/\b(?:(sabor|gosto|delicia)|(saborosa|gostosa|deliciosa))\b/
```
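To see why the non-capturing group matters, here is a minimal sketch comparing an ungrouped alternation (as in the question's code, minus the \K) with a grouped one. The variable names are only illustrative.

```php
<?php
$words = ['sabor', 'gosto', 'delicia'];

// Ungrouped: the pattern is /\bsabor|gosto|delicia\b/.
// The leading \b only belongs to "sabor" and the trailing \b only to "delicia",
// so the middle alternative "gosto" may match inside a longer word.
$ungrouped = '/\b' . implode('|', $words) . '\b/';

// Grouped: both word boundaries apply to every alternative.
$grouped = '/\b(?:' . implode('|', $words) . ')\b/';

$ungroupedHit = preg_match($ungrouped, 'gostosa', $m1); // 1, and $m1[0] is 'gosto'
$groupedHit   = preg_match($grouped, 'gostosa', $m2);   // 0, no whole-word match
```

With the grouped form, "gostosa" can only be matched by a pattern that actually lists "gostosa" as an alternative.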

Replacement Execution:

Using the generated pattern to match whole words from any subarray in $myWords, and with the $flipped lookup array passed in via use, the custom callback receives an array of match values for each match.

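$m[0] will be the full-string match. While it holds the desired value, it does not tell us which subarray the match came from. For this reason, $m[0] is removed from the array with array_shift(), which also re-indexes the capture groups from 0 so that they line up with the keys of $flipped.

If the matched word came from the first set of words, then the first capture group (originally $m[1], index 0 after the shift) will hold a non-empty string. This captured word is removed from the callback's copy of $flipped to eliminate the possibility of replacing a word with itself. Because $flipped is passed to the closure by value, the removal does not carry over to the next match.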

Finally, array_rand() is used to extract one of the remaining words from the relevant subarray. This random selection becomes the word that is used as the replacement.
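The flip / unset / pick sequence can be tried in isolation. This is only a sketch: $matched is a hypothetical stand-in for the captured word, and $candidates plays the role of one row of $flipped.

```php
<?php
$row = ['saborosa', 'gostosa', 'deliciosa'];
$matched = 'gostosa'; // hypothetical: the word the regex just captured

// Values become keys: ['saborosa' => 0, 'gostosa' => 1, 'deliciosa' => 2]
$candidates = array_flip($row);

// Remove the matched word so it cannot be chosen as its own replacement.
unset($candidates[$matched]);

// array_rand() returns a random key, which after the flip is a word.
$replacement = array_rand($candidates);
```

Flipping first means the word can be removed with a direct unset() instead of a search, and array_rand() hands back the word itself rather than a numeric index.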

Oh, and the foreach() in the callback will keep iterating until it finds a non-empty string. In other words, if the captured word was in the second subarray, it would skip the empty string at $i === 0, then take action when $i === 1.
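To make the shape of the match array concrete, the sketch below uses preg_match() as a stand-in, on the assumption that the callback receives the same kind of array that preg_match() fills in. The names $first and $second are only illustrative.

```php
<?php
$pattern = '/\b(?:(sabor|gosto|delicia)|(saborosa|gostosa|deliciosa))\b/';

preg_match($pattern, 'gosto', $first);    // word from the first subarray
preg_match($pattern, 'gostosa', $second); // word from the second subarray

// Drop the full-string match so the indexes line up with $myWords / $flipped.
array_shift($first);
array_shift($second);

// $first  is ['gosto']        -> non-empty string at index 0
// $second is ['', 'gostosa']  -> empty string at index 0, the word at index 1
```

The empty string at index 0 of $second is what the if ($captured) check skips over.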

preg_replace_callback() is not assigned a limit, so it will make as many replacements as it can, but it will only make one pass over the string. This means that it will not replace replacements.
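The single-pass behaviour is easier to see with a fixed mapping instead of random picks. The $swap array below is a hypothetical example, and the str_replace() call is included only as a contrast.

```php
<?php
// Hypothetical fixed mapping (no randomness) so the result is predictable.
$swap = ['gosto' => 'sabor', 'sabor' => 'gosto'];

// One pass: each matched word is replaced once and never re-examined.
$result = preg_replace_callback(
    '/\b(?:gosto|sabor)\b/',
    function ($m) use ($swap) {
        return $swap[$m[0]];
    },
    'gosto sabor'
);
// $result is 'sabor gosto'

// Contrast: str_replace() with arrays applies each pair in turn,
// so the second pair also rewrites what the first pair produced.
$chained = str_replace(array_keys($swap), array_values($swap), 'gosto sabor');
// $chained is 'gosto gosto'
```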