Tags: php, regex, word-wrap

How to wrap all fragments (that are not inside of <div> or <h>) with <p>?


I have this code:

$html = 
'<h1>Headline</h1>
That should be paragraph 1.

That should be paragraph 2.

<div>Something.</div>
That should be paragraph 3.
';

And I want to change it into:

$html = 
'<h1>Headline</h1>
<p>That should be paragraph 1.</p>

<p>That should be paragraph 2.</p>

<div>Something.</div>
<p>That should be paragraph 3.</p>
';

How can I achieve this in PHP?

I was thinking about replacing each empty line with <p>, </p>, or </p><p>, based on a counter of open and closed paragraphs. But that would force the user to add an empty line after every </h1>.

Could a regular expression do the job without imposing a rule like "you have to add a new line after </h1> and before <h1>"?

The more general the approach is, the better.


Solution

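  • The negative lookahead below asserts that the line does not contain an opening <h1> or <div> tag. If the assertion holds, the pattern captures the whole line into group 1. Empty lines are left untouched because .+ needs at least one character, and the m modifier used in the code makes ^ and $ match at the start and end of every line.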

    ^(?!.*?(?:<h1>|<div>))(.+)$
    

    Replacement string:

    <p>$1</p>
    

    Code:

    <?php
    $string = <<<EOT
    <h1>Headline</h1>
    That should be paragraph 1.
    
    That should be paragraph 2.
    
    <div>Something.</div>
    That should be paragraph 3.
    EOT;
    // The m modifier lets ^ and $ anchor at every line, not just the whole string.
    echo preg_replace('~^(?!.*?(?:<h1>|<div>))(.+)$~m', '<p>$1</p>', $string);
    ?>
    

    Output:

    <h1>Headline</h1>
    <p>That should be paragraph 1.</p>
    
    <p>That should be paragraph 2.</p>
    
    <div>Something.</div>
    <p>That should be paragraph 3.</p>
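
Since the question asks for a more general approach, the same lookahead idea can be widened to skip any heading level (<h1> to <h6>), tags that carry attributes, and lines holding only a closing tag. The following is a minimal sketch, not a definitive implementation: the helper name wrap_loose_lines is hypothetical (it does not come from the answer above), and it assumes every heading or <div> starts and ends on a single line.

```php
<?php
// Hypothetical helper (an assumption, not part of the original answer):
// wraps each non-blank line in <p>...</p> unless the line contains an
// opening or closing <h1>..<h6> or <div> tag, with or without attributes.
function wrap_loose_lines(string $html): string
{
    $pattern = '~^'
        . '(?!.*</?(?:h[1-6]|div)\b)' // skip lines containing a heading or div tag
        . '(?=.*\S)'                  // skip whitespace-only lines
        . '(.+)$~mi';                 // m: per-line anchors, i: case-insensitive tags

    // preg_replace() returns null on a regex error; fall back to the input then.
    return preg_replace($pattern, '<p>$1</p>', $html) ?? $html;
}

$html = "<h2 class=\"title\">Headline</h2>\n"
      . "That should be paragraph 1.\n"
      . "\n"
      . "<div id=\"box\">Something.</div>\n"
      . "That should be paragraph 2.\n";

echo wrap_loose_lines($html);
```

With this input, the two plain text lines are wrapped while the <h2> and <div> lines and the blank line are left as they are. A line-based pattern like this still wraps text inside a <div> that spans several lines, so for arbitrary markup an HTML parser such as PHP's DOMDocument is likely the more robust route.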