Convert html headlines to list elements in PHP


I am learning PHP. I want to show a table of contents for an article by converting its headings (h2, h3, h4, ...) into a list of links. This is my PHP code:

$Post = '
<h2>Title 01</h2>
<h3>Title 01.01</h3>
<h3>Title 01.02</h3>
<h2>Title 02</h2>
<h3>Title 02.02</h3>
';

$c = 1;
$r = preg_replace_callback('~<h*([^>]*)>~i', function($res) use (&$c){
    return '<li><a id="#id'.$c++.'">'.$res[1].'</a></li>';
}, $Post);
$Post = $r;


echo '<ul>';
echo $Post;
echo '</ul>';

The code above produces the following output, which is wrong:

<ul>
<li><a id="#id1">2</a></li>Title 01<li><a id="#id2">/h2</a></li>
<li><a id="#id3">3</a></li>Title 01.01<li><a id="#id4">/h3</a></li>
<li><a id="#id5">3</a></li>Title 01.02<li><a id="#id6">/h3</a></li>
<li><a id="#id7">2</a></li>Title 02<li><a id="#id8">/h2</a></li>
<li><a id="#id9">3</a></li>Title 02.02<li><a id="#id10">/h3</a></li>
</ul>

I know that the PHP code is written incorrectly, but I want the output to look like this:

<ul>
<li><a href="#id1">Title 01</a></li>
<li><a href="#id2">Title 01.01</a></li>
<li><a href="#id3">Title 01.02</a></li>
<li><a href="#id4">Title 02</a></li>
<li><a href="#id5">Title 02.02</a></li>
</ul>

Solution

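  • Your regular expression does not match what you intend. In <h*([^>]*)>, h* means "zero or more h characters" and the group captures whatever else sits between the angle brackets. The pattern therefore matches opening and closing tags alike and captures the rest of the tag (2, /h2, ...) instead of the heading text.

    You could just use <h.>(.*)</h.> to match a whole heading element and capture its text. Note that . matches any single character, so <h[1-6]> would be stricter.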

    I added it to your snippet above to show your desired result:

    $post = '
    <h2>Title 01</h2>
    <h3>Title 01.01</h3>
    <h3>Title 01.02</h3>
    <h2>Title 02</h2>
    <h3>Title 02.02</h3>
    ';
    
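    $c = 1;
    // Match each whole heading element; $res[1] holds the heading text.
    $list_elements = preg_replace_callback('~<h.>(.*)</h.>~i', function($res) use (&$c){
        // Use href (not id) so each entry links to the anchor #idN.
        return '<li><a href="#id'.$c++.'">'.$res[1].'</a></li>';
    }, $post);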
    
    
    echo '<ul>';
    echo $list_elements;
    echo '</ul>';
    

    That said, as suggested in the comments, you should probably use an HTML parser if this turns into anything more than a toy example. On real-world HTML (attributes on the tags, nested markup, headings spanning several lines), regular expressions are almost always a sure way to shoot yourself in the foot.
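
For reference, the parser-based approach could look roughly like the sketch below. It is not from the original answer: the function name build_toc is made up for illustration, and the code assumes PHP's bundled DOM extension (DOMDocument, DOMXPath) is enabled. It only builds the list of links; the headings themselves would still need matching id attributes for the links to jump anywhere.

```php
<?php
// Hypothetical helper (name invented for this sketch): build a table of
// contents from the h1..h6 elements of an HTML fragment.
function build_toc(string $html): string
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    // The XML declaration is a common encoding hint so UTF-8 text survives.
    $doc->loadHTML('<?xml encoding="utf-8" ?><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Union of all heading levels; nodes come back in document order.
    $headings = $xpath->query('//h1 | //h2 | //h3 | //h4 | //h5 | //h6');

    $items = [];
    $c = 1;
    foreach ($headings as $heading) {
        // textContent drops nested markup; escape it again for output.
        $text = htmlspecialchars(trim($heading->textContent));
        $items[] = '<li><a href="#id' . $c++ . '">' . $text . '</a></li>';
    }

    return "<ul>\n" . implode("\n", $items) . "\n</ul>";
}

$post = '
<h2>Title 01</h2>
<h3>Title 01.01</h3>
<h3>Title 01.02</h3>
<h2>Title 02</h2>
<h3>Title 02.02</h3>
';

echo build_toc($post);
```

Unlike the regular expression, this should keep working when a heading carries attributes, contains inline tags such as <em>, or is split across several lines.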