php regex simple-html-dom

Replace multiple '&nbsp;' with HTML tag


Working on a strange one today: a client will be pasting a load of text into an HTML editor, which adds many non-breaking spaces. I need to style this when it is output to the browser.

I am trying to make the following string

<ul class="columns">
    <li><u>Running costs</u></li>
    <li>Urban mpg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 50.4 mpg</li>
    <li>Extra Urban mpg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 72.4 mpg</li>
</ul>

Change to this:

<ul class="columns">
    <li><u>Running costs</u></li>
    <li>Urban mpg<span class="right">50.4 mpg</span></li>
    <li>Extra Urban mpg<span class="right">72.4 mpg</span></li>
</ul>

I am happy to use preg_replace or an HTML parser; I have tried both. My PHP preg_replace call is this:

preg_replace("/(&nbsp;)/", "<span>", $input_lines);

This replaces every &nbsp; with <span>. I only want one span to be added, and I need it to close at the end too. I have been experimenting with Simple HTML DOM but am not sure which functions to use to best achieve this.

Thanks


Solution

  • You could try

    (?:&nbsp;)+([^<]*)
    

    Replace with

    <span class="right">\1</span>
    

    It matches a run of one or more consecutive &nbsp; entities, then captures everything up to the < of the closing tag. The replacement wraps the captured text in the span.

    See it here at regex101.
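
As a rough sketch of how the pattern above might be wired into preg_replace: the function name wrapAfterNbsp is hypothetical, and the extra \s* is an addition to the answer's pattern, assuming you want to drop the ordinary space that follows the &nbsp; run in the sample input (without it, the captured text starts with a space).

```php
<?php
// Hypothetical helper: wraps the text that follows a run of &nbsp; in a span.
function wrapAfterNbsp(string $html): string
{
    // (?:&nbsp;)+  one or more consecutive &nbsp; entities (non-capturing)
    // \s*          optional whitespace after the run (an assumption added
    //              to the answer's pattern so the span has no leading space)
    // ([^<]*)      everything up to the next "<", i.e. the closing tag
    $pattern = '/(?:&nbsp;)+\s*([^<]*)/';

    // preg_replace returns null on error; fall back to the untouched input.
    return preg_replace($pattern, '<span class="right">$1</span>', $html) ?? $html;
}

// One line of the sample input from the question.
$input = '<li>Urban mpg' . str_repeat('&nbsp;', 10) . ' 50.4 mpg</li>';

echo wrapAfterNbsp($input), PHP_EOL;
// prints: <li>Urban mpg<span class="right">50.4 mpg</span></li>
```

Lines with no &nbsp;, such as the "Running costs" heading item, pass through unchanged. Note that the capture stops at the first <, so this only fits cases like the sample where the value contains no nested tags.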