php html regex xml ezpublish

Regex - Replacing content - eZ Publish XML field


I have some XML content that I want to modify before creating it through the eZ Publish 5 API.

I am trying to write a regex to modify the content.

Here is the XML code that I have (with HTML entities):

Print of Xml code http://img15.hostingpics.net/pics/453268xmlcode.jpg

I want to be able to match empty.jpg in:

<img alt="" src="http://www.asite.org/empty.jpg" />

And replace the whole line, for each occurrence, with:

<custom name="my_checkbox"></custom>

Problem:

The img tag can sometimes contain other attributes, such as height="15" width="12":

&lt;img height="15" alt="" width="12" src="http://www.asite.org/empty.jpg" /&gt;

And sometimes the attributes come after the src attribute, in a different order.

The aim would be:

Xml code - Aim http://img15.hostingpics.net/pics/318980xmlcodeaim.jpg

I've tried many things so far but nothing worked.

Thanks in advance for helping.

Cheers !

EDIT:

Here is an example of what I've tried so far:

/(&lt;img [a-z = ""]* src="http:\/\/www\.asite\.org\/empty\.jpg" \/&gt)/g

Solution

  • Since this is XML, I used an XML parser to reach the desired section.

    Then we can apply a regex (~<img.*?>(?=</span)~) to select the image tag and replace it with your custom tag (note that in the object returned by the XML parser, the HTML entities are replaced with their equivalent characters).

    This piece of code emulates and handles your situation:

    <?php
    $xmlstr = <<<XML
    <sections>
      <section>
        <paragraph>
          <literal class="html">
            &lt;img alt="" src="http://asite.org/empty.png" /&gt;&lt;/span&gt;&lt;/span&gt; Yes/no&amp;nbsp;&lt;br /&gt;
            &lt;img alt="" src="http://asite.org/empty.png" /&gt;&lt;/span&gt;&lt;/span&gt; Other text/no&amp;nbsp;&lt;br /&gt;
          </literal>
        </paragraph>
      </section>
    </sections>
    XML;
    
    $sections = new SimpleXMLElement($xmlstr);
    
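    foreach ($sections->section->paragraph as $paragraph) {
      // The node's string value is already decoded, so match "<img", not "&lt;img".
      // The lookahead keeps only <img> tags immediately followed by </span.
      $re = "~<img.*?>(?=</span)~";
      $subst = "<custom name=\"my_checkbox\"></custom>";
      // Assigning the result back stores it as text; asXML() re-encodes < and >.
      $paragraph->literal = preg_replace($re, $subst, $paragraph->literal);
    }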
    
    echo $sections->asXML();
    
    ?>
    

    The output is:

    <?xml version="1.0"?>
    <sections>
      <section>
        <paragraph>
          <literal class="html">
            &lt;custom name="my_checkbox"&gt;&lt;/custom&gt;&lt;/span&gt;&lt;/span&gt; Yes/no&amp;nbsp;&lt;br /&gt;
            &lt;custom name="my_checkbox"&gt;&lt;/custom&gt;&lt;/span&gt;&lt;/span&gt; Other text/no&amp;nbsp;&lt;br /&gt;
          </literal>
        </paragraph>
      </section>
    </sections>
    

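
The regex above replaces any <img> tag that directly precedes </span, whatever its src. If only the empty.jpg images should be replaced, regardless of attribute order, a stricter pattern might look like the sketch below. It is a minimal illustration, not part of the original answer: the helper name replaceEmptyImages and the one-paragraph sample document are assumptions made for the example, and the pattern assumes double-quoted attribute values that contain no > character.

```php
<?php
// Hypothetical helper (an assumption for this sketch, not from the answer above):
// replaces only <img> tags whose src ends in /empty.jpg, in any attribute order.
function replaceEmptyImages(string $html): string
{
    // [^>]*? skips attributes placed before src; the trailing [^>]* skips those after it.
    $re = '~<img\b[^>]*?\ssrc="[^"]*/empty\.jpg"[^>]*>~i';
    $subst = '<custom name="my_checkbox"></custom>';
    // preg_replace() returns null on error; fall back to the untouched input.
    return preg_replace($re, $subst, $html) ?? $html;
}

// Made-up sample: one literal holding entity-encoded HTML, as in the question.
$xml = new SimpleXMLElement(
    '<paragraph><literal class="html">'
    . '&lt;img height="15" alt="" width="12" src="http://www.asite.org/empty.jpg" /&gt; Yes/no '
    . '&lt;img src="http://www.asite.org/other.jpg" alt="" /&gt;'
    . '</literal></paragraph>'
);

// Casting the node to string gives the decoded text ("<img ...", not "&lt;img ...").
$decoded = (string) $xml->literal;

// Assigning to the existing child stores the new text; asXML() re-encodes < and >.
$xml->literal = replaceEmptyImages($decoded);

echo $xml->asXML();
```

Running this should print the paragraph with the empty.jpg image replaced by the custom tag and the other.jpg image left untouched. The \s before src is there so that an attribute such as data-src is not mistaken for src.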