javascript jquery xml rss

jQuery: parse a tag inside CDATA


I'm trying to parse a tag that is nested inside a CDATA value.

Basically, I have the following XML:

<item>
            <title>Time Travel Via Wormhole Breaks the Rules of Quantum Mechanics</title>
            <description><![CDATA[<p>Science has done it again everybody! Brace yourselves for this groundbreaking news, freshly determined by physicists: Time travel, if it exists, may have some weird consequences. Gosh, who’d have thunk it? But no, seriously, a recent article suggests that a certain kind of theoretically possible time machine would wreak minor havoc with a firm principle [&#8230;]</p><p>The post <a href="http://blogs.discovermagazine.com/crux/2014/01/16/time-travel-via-wormhole-breaks-the-rules-of-quantum-mechanics/">Time Travel Via Wormhole Breaks the Rules of Quantum Mechanics</a> appeared first on <a href="http://blogs.discovermagazine.com/crux">The Crux</a>.</p>]]></description>
            <content:encoded><![CDATA[<p><a href="https://i.sstatic.net/kUmJM.jpg"><img class="aligncenter  wp-image-3898" alt="time-travel" src="https://i.sstatic.net/kUmJM.jpg" width="600" height="405" /></a></p>
<p>Science has done it again everybody! Brace yourselves for this groundbreaking news, freshly determined by physicists: Time travel, if it exists, may have some weird consequences. Gosh, who’d have thunk it?</p>
<p>As with all speculative science stories, it’s important to keep things in perspective. This finding would have far-reaching and serious consequences for Internet encryption and quantum computers, among other things — assuming these wormholes really do exist. But, equally valid, the fact that this theoretical construction appears to violate known physical laws also suggests that, alas, maybe the particular wormholes in the study just don’t exist.</p>
<p>Whatever tricks the universe has up its sleeve, it’s exciting that we’re able to study even its wackiest possibilities in so much detail. I can&#8217;t wait to see how it turns out (no spoilers, time travelers).</p>
<p><em>Image courtesy <a id="portfolio_link" href="http://www.shutterstock.com/gallery-73592p1.html">Graeme Dawes </a>/ <a id="portfolio_link" href="http://www.shutterstock.com/gallery-551845p1.html">Ilias Strachinis </a>/ Shutterstock</em></p>
<p>The post <a href="http://blogs.discovermagazine.com/crux/2014/01/16/time-travel-via-wormhole-breaks-the-rules-of-quantum-mechanics/">Time Travel Via Wormhole Breaks the Rules of Quantum Mechanics</a> appeared first on <a href="http://blogs.discovermagazine.com/crux">The Crux</a>.</p>]]></content:encoded>
</item>

I can correctly parse the title, the description, and also the content:encoded tag with its entire CDATA value, like this:

$(this.data).find('item:lt(3)').each(function(index) {
    var e = $(this);
    console.log(e);
    var category = e.find('category').text();
    var link     = e.find('link').text();
    var title    = e.find('title').text();
    var summary  = e.find('description').text().substring(0, 120) + "...";
    var content  = e.find('encoded').text();  // the whole CDATA string of <content:encoded>
    var image;                                // HOW TO EXTRACT the <img> src from content?
    alert(image);
});

What I'm missing is the URL of the image, which unfortunately is not in a dedicated element as it is in some other RSS feeds. For example:

<enclosure type="image/jpeg" url="http://www.nwzonline.de/rw/NWZ_CMS/NWZ/2011-2013/Produktion/2014/01/17/SPORT/2/Bilder/generated/SPORT_1_8d5e0b63-8d51-4e87-8249-58eab44cc923--600x337--280x158.jpg"></enclosure>
var image = e.find('enclosure').attr('url');

But here it is inside the CDATA. Any idea how I could extract the src value from it? I need to obtain: "http://blogs.discovermagazine.com/crux/files/2014/01/time-travel.jpg"

Thanks so much.


Solution

  • There are no tags inside the CDATA. CDATA means "the stuff in here contains things that look like tags, but they are not tags, they are ordinary character data". That's the only purpose of CDATA, to say that there are no tags inside; if you want the tags treated as tags, don't mislabel them as CDATA.

    If someone else has made this mistake and you have to correct it, then the only way is to extract the string inside the CDATA section and pass it to an XML parser for parsing into a tree.
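
As a rough sketch of that second step, the string from the CDATA section can be handed to a parser and the resulting tree queried for the image. This is only one possible approach, with some assumptions: it runs in a browser where DOMParser is available, and it parses the string as HTML ('text/html') rather than XML, because the CDATA content in the question has several top-level <p> elements and would not form a single-rooted XML document. The function name firstImageSrc and its optional parser argument are made up for this illustration; the argument exists only so a different parser can be supplied outside a browser.

```javascript
// Hypothetical helper (not part of jQuery): returns the src of the first
// <img> found in an HTML string, or null if there is none.
function firstImageSrc(htmlString, parser) {
  // Assumption: in a browser, DOMParser is a global. The optional `parser`
  // argument is an illustrative hook for environments without one.
  parser = parser || new DOMParser();

  // Parse the character data into a real document tree.
  var doc = parser.parseFromString(htmlString, 'text/html');

  // Now the <img> is an actual element and can be queried as usual.
  var img = doc.querySelector('img');
  return img ? img.getAttribute('src') : null;
}

// Possible use inside the .each() loop from the question:
//   var content = e.find('encoded').text();
//   var image   = firstImageSrc(content);
```

The value returned is whatever the src attribute holds in the feed, so it may be undefined-like (null here) for items without an image, which the calling code would need to handle.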