Add comments and attributes including an incremented number to elements in an HTML string


I have been trying to understand how preg_replace_callback() works, but I just don't get it.

Say, for example, I load navigation.php with file_get_contents().

That text contains a bunch of <a href> and <div> tags. I want to give each of them an incremental id and add a comment before each <a href>.

How would I loop over all of them so that the counter increments and the ids and comments get added?

<?php
$string = file_get_contents("navigation.php");
$i = 1;
$replace = "<a ";
$with = '<!-- UNIT'.$i.' --><a id=a_'.$i;
$replace2 = "<div ";
$with2 = '<div id=b_'.$i;
preg_replace_callback()
$i++

?>

I figured that if I could get an example based on my code, I would be able to understand it better.

So $replace and $replace2 are the strings I am searching for, $with and $with2 are their respective replacements, and $i is the counter.

An example of data coming in:

<a href="page4.php">Page 4</a>
<a href="page3.php">Page 3</a>
<div class="red">stuff</div>
<div class="blue">stuff</div>

I would want output like this:

<!-- UNIT 1 --><a id="a_1" href="page4.php">Page 4</a>
<!-- UNIT 2 --><a id="a_2" href="page3.php">Page 3</a>
<div id="b_1" class="red">stuff</div>
<div id="b_2" class="blue">stuff</div>

Solution

  • You have multiple goals; the simplest way to accomplish them, imo, is to go step by step.

    1. The RegEx

    You want to match two HTML tags, and /(<a|<div)/i catches both easily. Be aware that it also matches the start of longer tag names such as <abbr> or <article>; appending \b to the pattern rules those out.

    With this you could write the following code:

    $parsed = preg_replace_callback('/(<a|<div)/i', ???, $string);
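To see what the pattern picks up before writing any callback, a quick check with preg_match_all() can help. This is only a sketch: the sample markup is copied from the question, and the trailing \b is an addition that is not part of the pattern shown above.

```php
<?php
// Sample markup taken from the question.
$string = <<<HTML
<a href="page4.php">Page 4</a>
<a href="page3.php">Page 3</a>
<div class="red">stuff</div>
<div class="blue">stuff</div>
HTML;

// Collect every opening "<a" or "<div". Closing tags such as "</a>" do not
// match because "<" is followed by "/" there.
preg_match_all('/(<a|<div)\b/i', $string, $matches);

// $matches[1] holds the text captured by the first group:
// '<a', '<a', '<div', '<div'
print_r($matches[1]);

// Why the \b matters: without it the pattern also fires on "<abbr".
$withoutBoundary = preg_match('/(<a|<div)/i', '<abbr title="x">');   // 1
$withBoundary    = preg_match('/(<a|<div)\b/i', '<abbr title="x">'); // 0
```

Each entry in $matches[1] is what the callback will later receive as $match[1].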
    

    2. The callback

    The logic behind this can be simplified to the following switch:

    switch ($found) {
        case '<div':
            $result = '<div id="b_'.$id.'"';
            break;
        case '<a':
            $result = '<!-- UNIT '.$id.' --><a id="a_'.$id.'"';
            break;
        default:
            $result = $found; // leave anything else untouched
            break;
    }
    

    To implement this you can either write a named function or use an anonymous one. To make $id accessible inside the callback, you need to understand variable scope in PHP. An easy way to avoid anything like global $id; or define() is a closure with the use syntax. To be able to modify $id (increment it) from inside the closure, you need to capture it by reference, and $id should be initialised (for example to 1) before the call. This brings you to the following code:
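The difference between capturing by value and by reference is easy to miss, so here is a minimal sketch of it in isolation. The variable and closure names are made up for illustration and have nothing to do with the navigation code.

```php
<?php
$byValue = 1;
$byReference = 1;

// Captured by value: the closure gets a copy made when it is defined,
// so incrementing it never touches the outer variable.
$copyCounter = function () use ($byValue) {
    $byValue++;
    return $byValue;
};

// Captured by reference: the closure and the outer scope share one variable.
$sharedCounter = function () use (&$byReference) {
    $byReference++;
    return $byReference;
};

$copyCounter();
$copyCounter();
$sharedCounter();
$sharedCounter();

echo $byValue, "\n";     // 1, the outer variable is unchanged
echo $byReference, "\n"; // 3, both calls incremented the shared variable
```

A callback passed to preg_replace_callback() runs once per match, so only the by-reference form keeps counting across matches.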

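    $id = 1; // initialise the counter, otherwise the first id would be empty

    // \b keeps the pattern from matching longer tag names such as <abbr>
    $parsed = preg_replace_callback('/(<a|<div)\b/i', function ($match) use (&$id) {
        // normalise the case, since the pattern is case-insensitive
        switch (strtolower($match[1])) {
            case '<div':
                $result = '<div id="b_'.$id.'"';
                break;
            case '<a':
                $result = '<!-- UNIT '.$id.' --><a id="a_'.$id.'"';
                break;
            default:
                $result = $match[1]; // do nothing
                break;
        }
        $id++; // one counter shared by both tag types

        return $result;
    }, $string);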
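Note that the code above uses a single counter for both tag types, so the divs continue the numbering of the links. The output requested in the question numbers links and divs separately (a_1, a_2, b_1, b_2). One possible way to get that, sketched here with a hypothetical $counters array that is not part of the code above, is to keep one counter per tag name:

```php
<?php
// Sample markup taken from the question.
$string = <<<HTML
<a href="page4.php">Page 4</a>
<a href="page3.php">Page 3</a>
<div class="red">stuff</div>
<div class="blue">stuff</div>
HTML;

// One counter per tag name, both starting at 1 (an assumption based on the
// desired output in the question).
$counters = ['a' => 1, 'div' => 1];

$parsed = preg_replace_callback('/<(a|div)\b/i', function ($match) use (&$counters) {
    $tag = strtolower($match[1]);
    // Read the current number for this tag, then advance only that counter.
    $n = $counters[$tag]++;

    if ($tag === 'a') {
        return '<!-- UNIT ' . $n . ' --><a id="a_' . $n . '"';
    }

    return '<div id="b_' . $n . '"';
}, $string);

echo $parsed;
```

Here the capture group holds only the tag name, which doubles as the array key. Like any regex-based approach, this assumes reasonably regular markup; for arbitrary HTML a real parser is the safer choice.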
    
