php, regex, preg-replace-callback

How do I use preg_replace_callback if I want to use the same "string" twice in a callback?


I wrote a BBCode function that finds a match and does the replacement. However, if the same tag is nested inside itself, it does not return the match correctly.

The code is:

<?php

//CODE EXAMPLE

class BBCode {

	protected $bbcode = array();

	public function __construct() {
		// Replace [div class="class name(s)"]...[/div] with <div class="...">...</div>
		$this->bbcode["/\[div class=\"([^\"]+)\"\](.*?)\[\/div\]/is"] = function ($match) {
			return "<div class=\"$match[1]\">$match[2]</div>";
		};	
	}

	public function rander($str) {
		foreach ($this->bbcode as $key => $val) {
			$str = preg_replace_callback($key, $val, $str);
		}
		return $str;
	}
}

?>

If I use just one tag, it works fine, like this:

$str = '[div class="class1"]this is a div[/div]';

Even with different tags nested inside each other, it works:

$str = '[div class="class1"][p]this is a paragraph inside a div[/p][/div]';

But when I try to use:

$str = '[div class="class1"][div class="class2"]A div inside a div[/div][/div]';

it does not work, and the output is:

<div class="class1">[div class="class2"]A div inside a div</div>[/div]

instead of:

<div class="class1"><div class="class2">A div inside a div</div></div>

How can I fix it to work correctly?

Thanks!

A link to the whole bbcode class code on github


Solution

  • Your regexp matches from the first [div class="..."] up to the first [/div] it finds, so the opening and closing tags are paired incorrectly (as the output in your example shows). To prevent this, use a negative lookahead so that the body cannot contain another [div or [/div:

    "/\[div class=\"([^\"]+)\"\]((?!\[div|\[\/div).)*\[\/div\]/is"
    

    Note that this will match only once (only the inner div-pair), so you have to repeat the matching until nothing is found anymore.

    Demo:

    $str = "a[div class=\"b\"]c[div class=\"d\"]e[/div]f[/div]g";
    $reg = "/\[div class=\"([^\"]+)\"\]((?!\[div|\[\/div).)*\[\/div\]/is";
    preg_match($reg, $str, $matches);
    var_dump($matches);
    

    will output:

    array(3) {
        [0]=>
        string(22) "[div class="d"]e[/div]"
        [1]=>
        string(1) "d"
        [2]=>
        string(1) "e"
    }
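
One caveat about the pattern above, shown as a small sketch: because the * repeats the capturing group itself, group 2 keeps only the last character of the body. With the one-character body "e" in the demo this is invisible, but a longer body makes it show. The variable names and the second pattern ($inside) are illustrative and not part of the original answer.

```php
<?php
// First pattern: the * repeats the capturing group itself,
// so group 2 holds only the last character it matched.
$outside = "/\[div class=\"([^\"]+)\"\]((?!\[div|\[\/div).)*\[\/div\]/is";

// Variant: the * sits inside the capturing group, around a non-capturing
// "lookahead plus one character" unit, so group 2 holds the whole body.
$inside = "/\[div class=\"([^\"]+)\"\]((?:(?!\[div|\[\/div).)*)\[\/div\]/is";

$str = '[div class="d"]hello[/div]';

preg_match($outside, $str, $a);
preg_match($inside, $str, $b);

var_dump($a[2]); // string(1) "o"
var_dump($b[2]); // string(5) "hello"
```

Both patterns match the same text overall; they differ only in what ends up in group 2, which matters as soon as the group is used in a replacement.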
    

    Edit: Yes, as you commented, it replaces only the first match. The * is in the wrong place: it has to repeat the whole "lookahead plus one character" unit, inside the capture group, so that the group holds the complete body. Try this code:

    $str = 'a[div class="b"]c[div class="d"]e[/div]f[/div]g';
    // $4 is the body: a run of characters, none of which starts "[div" or "[/div"
    $reg = "/^(.*)(\[div class=\"([^\"]+)\"\])((?:(?!\[div|\[\/div).)*)\[\/div\](.*)$/is";
    
    $result = $str;
    $step = 1;
    echo "step 0: $result\n";
    do {
        // Each pass converts one innermost pair; $count is 0 once none is left
        $result = preg_replace($reg, "$1<div class=\"$3\">$4</div>$5", $result, -1, $count);
        echo "step $step: $result\n";
        $step++;
    } while ($count > 0);
    

    This outputs:

    step 0: a[div class="b"]c[div class="d"]e[/div]f[/div]g
    step 1: a[div class="b"]c<div class="d">e</div>f[/div]g
    step 2: a<div class="b">c<div class="d">e</div>f</div>g
    step 3: a<div class="b">c<div class="d">e</div>f</div>g
    

    Note: the loop runs one pass more than strictly needed. The last pass (step 3) changes nothing and only serves to detect that no [div] pairs are left.
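
Applied to the BBCode class from the question, the repeat-until-nothing-matches idea could look like the sketch below. It assumes a pattern whose body group cannot contain [div or [/div, so each pass converts only innermost pairs, and it uses the optional count argument of preg_replace_callback to decide when to stop. The method name rander is kept from the question; this is one possible adaptation, not the code from the linked class.

```php
<?php

class BBCode {

    protected $bbcode = array();

    public function __construct() {
        // Matches an innermost [div class="..."]...[/div] pair: the body
        // may not contain the start of another [div or [/div tag.
        $this->bbcode["/\[div class=\"([^\"]+)\"\]((?:(?!\[div|\[\/div).)*)\[\/div\]/is"] = function ($match) {
            return "<div class=\"$match[1]\">$match[2]</div>";
        };
    }

    public function rander($str) {
        foreach ($this->bbcode as $key => $val) {
            // Repeat until a pass makes no replacement, so outer pairs are
            // converted once their inner pairs have already become HTML.
            do {
                $str = preg_replace_callback($key, $val, $str, -1, $count);
            } while ($count > 0);
        }
        return $str;
    }
}

$bb = new BBCode();
echo $bb->rander('[div class="class1"][div class="class2"]A div inside a div[/div][/div]');
// <div class="class1"><div class="class2">A div inside a div</div></div>
```

Because the pattern is not anchored to the whole string, sibling pairs at the same nesting level are converted in the same pass, so the number of passes grows with the nesting depth rather than with the number of tags.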