
How to remove CDATA blocks inside a script element?

Using PHP, I want to remove the CDATA blocks inside the script elements of an HTML file.

<script type="text/javascript">
    /* <![CDATA[ */
    var A=new Array();
/* ]]> */
some text2 ........................
some text3 ........................
some text4 ........................
<script type="text/javascript">
    /* <![CDATA[ */
    var B=new Array();
/* ]]> */
some text5 ........................

I haven't found a way to select and remove these nodes with XPath and PHP's DOMDocument.

I tried this regular expression:

$re = '/\/\*\s*<!\[CDATA\[[\s\S]*\/\*\s*\]\]>\s*\*\//i';

But it removes everything from the first opening marker to the last closing marker, including the text between the two CDATA blocks.

As a result, I get an empty string instead of:

some text2 ........................ 
some text3 ........................ 
some text4 ........................ 
some text5 ........................

Any ideas?

Update with ThW's solution:

With this page, it seems that the text of the CDATA section is not parsed correctly.

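$domDoc = new DOMDocument();
// Assumption: $html holds the markup of the page being processed.
$domDoc->loadHTML($html);

$xpath = new DOMXPath($domDoc);
foreach ($xpath->evaluate('//text()') as $section) {
  if ($section instanceof DOMCDATASection) {
    // Assumed loop body: remove the CDATA section from its parent node.
    $section->parentNode->removeChild($section);
  }
}
$content = $domDoc->saveHTML();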

I got this textContent, which is cut off at the first closing tag inside the script:

function updateConstructeurs(list) {
    for (var i in list) {
        if(list[i]['thumbnail']) {
            jQuery('#reseau-constructeurs').append('<div class="reseau-constructeur">' +
                '<div class="img" style="background-image:url(' + list[i]['thumbnail'] + ')">

instead of the full script:


function updateConstructeurs(list) {
    for (var i in list) {
        if(list[i]['thumbnail']) {
            jQuery('#reseau-constructeurs').append('<div class="reseau-constructeur">' +
                '<div class="img" style="background-image:url(' + list[i]['thumbnail'] + ')"></div>' +
                '<h3>' + list[i]['title'] + '</h3>' +
                '<a class="btn purple" href="' + list[i]['link'] + '">Accéder à la fiche</a>' +

And as a result, instead of getting an empty string, we get:

                        '<h3>' + list[i]['title'] + '</h3>' +
                        '<a class="btn purple" href="'%20+%20list%5Bi%5D%5B'link'%5D%20+%20'">Acc&eacute;der &agrave; la fiche</a>' +
    /* ]]&gt; */


  • Make the [\s\S]* non-greedy, i.e. [\s\S]*?:
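
A minimal sketch of the non-greedy variant, assuming the input looks like the sample in the question. The variable names $html and $result and the abbreviated sample text are illustrative assumptions, not part of the original post. Only the quantifier changes, from [\s\S]* to [\s\S]*?.

```php
<?php
// Abbreviated copy of the question's sample markup (assumed input).
$html = <<<'HTML'
<script type="text/javascript">
    /* <![CDATA[ */
    var A=new Array();
/* ]]> */
some text2
<script type="text/javascript">
    /* <![CDATA[ */
    var B=new Array();
/* ]]> */
some text5
HTML;

// Same pattern as in the question, except that [\s\S]*? is lazy:
// each match ends at the nearest closing marker instead of the last one.
$re = '/\/\*\s*<!\[CDATA\[[\s\S]*?\/\*\s*\]\]>\s*\*\//i';

// Each CDATA block is matched and removed separately,
// so the text between the blocks is kept.
$result = preg_replace($re, '', $html);

echo $result;
```

With the greedy pattern, a single match runs from the first opening marker to the last closing marker, which is why the text between the two blocks disappeared. Note that this only strips the commented CDATA blocks themselves; the script tags are left in place.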

