Tags: php, regex, preg-replace, preg-replace-callback

preg_replace_callback() memory issue


I'm having a memory issue while testing a find/replace function.

Say the search subject is:

$subject = "

I wrote an article in the A+ magazine. It's very long and full of words. I want to replace every A+ instance in this text by a link to a page dedicated to A+.

";

The string to find:

$find='A+';
$find = preg_quote($find,'/');

The replacement callback:

 function replaceCallback($match)
    {
      if (is_array($match)) {
          return '<a class="tag" rel="tag-definition" title="Click to know more about ' .stripslashes($match[0]) . '" href="?tag=' . $match[0]. '">' . stripslashes($match[0])  . '</a>';
      }
    }

And the call:

$result = preg_replace_callback($find, 'replaceCallback', $subject);

In practice, the complete search pattern is built from the database. At the moment it is:

$find = '/(?![^<]+>)\b(voice recognition|test project reference|test|synesthesia|Superflux 2007|Suhjung Hur|scripts|Salvino a. Salvaggio|Professional Lighting Design Magazine|PLDChina|Nicolas Schöffer|Naziha Mestaoui|Nabi Art Center|Markos Novak|Mapping|Manuel Abendroth|liquid architecture|LAb[au] laboratory for Architecture and Urbanism|l'Arca Edizioni|l' ARCA n° 176 _ December 2002|Jérôme Decock|imagineering|hypertext|hypermedia|Game of Life|galerie Roger Tator|eversion|El Lissitzky|Bernhard Tschumi|Alexandre Plennevaux|A+)\b/s';

This $find pattern is then searched for (and replaced where found) in 23 columns across 7 MySQL tables.

Using preg_replace() instead of preg_replace_callback(), as suggested, seems to have solved the memory issue, but I'm now running into a new problem: the subject returned by preg_replace() is missing a lot of content.

UPDATE:

The content loss was due to using preg_quote($find,'/'). It now works, except for 'A+', which becomes 'A ' after processing.
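One way preg_quote() can cause this kind of loss, assuming it was applied to the joined list of terms rather than to each term separately: it escapes the | separators as well, so the alternation turns into one long literal string. The two-term list below is only an illustration, not the real pattern.

```php
<?php
// Illustrative terms only; the real list comes from the database.
$terms = ['test', 'A+'];

// Quoting the joined string escapes "|" too, giving the literal: test\|A\+
$wholeQuoted = preg_quote(implode('|', $terms), '/');

// Quoting each term first keeps "|" as alternation: test|A\+
$perTerm = implode('|', array_map(function ($term) {
    return preg_quote($term, '/');
}, $terms));

$broken  = '/(' . $wholeQuoted . ')/';
$working = '/(' . $perTerm . ')/';

var_dump(preg_match($broken, 'a test'));   // no match: looks for the literal text "test|A+"
var_dump(preg_match($working, 'a test'));  // matches "test"
```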


Solution

  • Alright, I can see now why you're using the callback.

    First of all, I'd change your callback to this:

    function replaceCallback( $match )
    {
        if ( is_array( $match ) )
        {
            $htmlVersion    = htmlspecialchars( $match[1], ENT_COMPAT, 'UTF-8' );
            $urlVersion     = urlencode( $match[1] );
            return '<a class="tag" rel="tag-definition" title="Click to know more about ' . $htmlVersion . '" href="?tag=' . $urlVersion. '">' . $htmlVersion  . '</a>';
        }
        return $match;
    }
    

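    The stripslashes() calls aren't going to do you any good. What the output needs is htmlspecialchars() for the title and link text, and urlencode() for the href. The latter matters for a term like A+: a literal + in a query string is decoded as a space, so ?tag=A+ is read as 'A ', whereas urlencode() produces A%2B.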

    As for the memory issue, you may want to break your pattern down into several smaller patterns and execute them in a loop. I think your pattern is just too big/complex for PHP to handle in a single call.
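A minimal sketch of that loop, under several assumptions: buildPatterns() and linkTerms() are hypothetical helper names, the default chunk size of 10 is arbitrary, and the terms arrive as a plain array rather than as a prebuilt pattern. Two changes from the original pattern are mine: each term goes through preg_quote() individually, so LAb[au] and A+ are matched literally, and \b is swapped for (?<!\w) and (?!\w), because \b cannot match between a + and a space. The extra (?![^<]*<\/a>) lookahead is one guess at how to stop a later pass from linking text inside a link created by an earlier pass; it only covers anchors whose text contains no nested tags. Whether this actually lowers memory use has not been measured here.

```php
<?php
// Callback from the answer above: $match[1] is the captured term.
function replaceCallback(array $match): string
{
    $htmlVersion = htmlspecialchars($match[1], ENT_COMPAT, 'UTF-8');
    $urlVersion  = urlencode($match[1]);
    return '<a class="tag" rel="tag-definition" title="Click to know more about '
        . $htmlVersion . '" href="?tag=' . $urlVersion . '">' . $htmlVersion . '</a>';
}

// Hypothetical helper: one pattern per chunk of terms.
function buildPatterns(array $terms, int $chunkSize = 10): array
{
    // Longest terms first, so "test project reference" is linked before "test".
    usort($terms, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    $patterns = [];
    foreach (array_chunk($terms, $chunkSize) as $chunk) {
        // Quote each term on its own so "|" stays an alternation operator.
        $quoted = array_map(function ($term) {
            return preg_quote($term, '/');
        }, $chunk);

        $patterns[] = '/(?![^<]+>)'      // not inside a tag (from the original pattern)
            . '(?![^<]*<\/a>)'           // assumption: not inside existing link text
            . '(?<!\w)(' . implode('|', $quoted) . ')(?!\w)/s';
    }
    return $patterns;
}

// Hypothetical helper: apply every chunk pattern in turn.
function linkTerms(string $subject, array $terms, int $chunkSize = 10): string
{
    foreach (buildPatterns($terms, $chunkSize) as $pattern) {
        $result = preg_replace_callback($pattern, 'replaceCallback', $subject);
        if ($result !== null) {  // null signals a PCRE error; keep the previous text
            $subject = $result;
        }
    }
    return $subject;
}
```

Calling linkTerms($subject, $terms) with the term list read from the database would then replace the single large preg_replace_callback() call. Checking preg_last_error() when a pass returns null should show whether a PCRE limit was hit.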