(I didn't know how to label my problem properly, so don't mind the title.)
I'm working on a Java DOM parser that parses an RSS feed and outputs a file (.xml). It all works except for one major component. (It uses Jsoup for some parts.)
In the <content:encoded> tag (the article body), it has to change all the <iframe> tags to <a> tags and set the visible part of each one to the thumbnail of the video that was contained in the iframe tag.
This is the code that converts the tags and changes their HTML.
String html = theString;
org.jsoup.nodes.Document docHtml = Jsoup.parse(html);
Elements body = docHtml.select("body");
Elements iframes = body.select("iframe");
iframes.tagName("a");
iframes.removeAttr("width");
iframes.removeAttr("height");
iframes.removeAttr("allowfullscreen");
iframes.removeAttr("frameborder");
//iframes.attr("href", youtubeURL);
for(int k=0; k<1; k++) {
    String[] array;
    String[] array1;
    array = new String[10];
    array1 = new String[10];
    String youtubeID = "";
    String link = "";
    array[k] = iframes.attr("src");
    //System.out.println(array[k]);
    String pattern = "(?<=watch\\?v=|/embed/)[^&#]*";
    Pattern compiledPattern = Pattern.compile(pattern);
    Matcher matcher = compiledPattern.matcher(array[k]);
    while(matcher.find()){
        array1[k] = matcher.group();
        //System.out.println(matcher.group());
        //This is the line in question
        iframes.html("<img src=\"http://img.youtube.com/vi/"+array1[k]+"/0.jpg\"/></br>Tap to play video");
        System.out.println(iframes);
    }
}
All the parsing works, and I can successfully get the YouTube ID out of the iframe tag using regex. But if the post has multiple videos, rather than inserting the right ID for each one, it inserts only the ID of the first video in that post.
So instead of (Please excuse the formatting)
<a src="http://www.youtube.com/embed/5CzKyR6jzyw"><img src="http://img.youtube.com/vi/5CzKyR6jzyw/0.jpg" /><br />Tap to play video</a>
It gives (Note the img src attribute)
<a src="http://www.youtube.com/embed/qxur7H_CtM0"><img src="http://img.youtube.com/vi/qxur7H_CtM0/0.jpg" /><br />Tap to play video</a>
<a src="http://www.youtube.com/embed/nQl1Y5suqP4"><img src="http://img.youtube.com/vi/qxur7H_CtM0/0.jpg" /><br />Tap to play video</a>
<a src="http://www.youtube.com/embed/H47WhjHcBSw"><img src="http://img.youtube.com/vi/qxur7H_CtM0/0.jpg" /><br />Tap to play video</a>
<a src="http://www.youtube.com/embed/UMr6_ODZsFg"><img src="http://img.youtube.com/vi/qxur7H_CtM0/0.jpg" /><br />Tap to play video</a>
<a src="http://www.youtube.com/embed/u8qzrBcont8"><img src="http://img.youtube.com/vi/qxur7H_CtM0/0.jpg" /><br />Tap to play video</a>
<a src="http://www.youtube.com/embed/0283IhwTWd4"><img src="http://img.youtube.com/vi/qxur7H_CtM0/0.jpg" /><br />Tap to play video</a>
<a src="http://www.youtube.com/embed/HOgnsaixbwE"><img src="http://img.youtube.com/vi/qxur7H_CtM0/0.jpg" /><br />Tap to play video</a>
I'm pretty sure it's something really simple, and I'm just missing it.
Edit: thanks irrelephant (For fixing the formatting)
Again, please excuse my lack of details and/or making something really simple sound confusing, but I don't know how to properly express the problem at hand.
Solved it!
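I changed the way Jsoup got the URLs. Before, iframes.attr("src") was called on the whole Elements collection, which only returns the attribute of the first matched element, and iframes.html(...) then wrote that same thumbnail into every element. I overlooked it.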
So I changed
array[k] = iframes.attr("src");
to:
for (Element e : body.select("iframe")) {
    // This handles each iframe element individually, rather than only the first one in the post.
    array[k] = e.attr("src");
    String pattern = "(?<=watch\\?v=|/embed/)[^&#]*";
    Pattern compiledPattern = Pattern.compile(pattern);
    Matcher matcher = compiledPattern.matcher(array[k]);
    while (matcher.find()) {
        array1[k] = matcher.group();
        // Set the thumbnail markup on this element only.
        e.html("<img src=\"http://img.youtube.com/vi/" + array1[k] + "/0.jpg\"/><br/>Tap to play video");
        //System.out.println(iframes);
    }
}
(There are other changes, but this is from the code mentioned in the original post).
Now it outputs the following (like it's supposed to; obviously I also made it change the src attribute to href, otherwise that would be silly and counterproductive):
<a src="http://www.youtube.com/embed/qxur7H_CtM0"><img src="http://img.youtube.com/vi/qxur7H_CtM0/0.jpg" /><br />Tap to play video</a>
<a src="http://www.youtube.com/embed/nQl1Y5suqP4"><img src="http://img.youtube.com/vi/nQl1Y5suqP4/0.jpg" /><br />Tap to play video</a>
<a src="http://www.youtube.com/embed/H47WhjHcBSw"><img src="http://img.youtube.com/vi/H47WhjHcBSw/0.jpg" /><br />Tap to play video</a>
<a src="http://www.youtube.com/embed/UMr6_ODZsFg"><img src="http://img.youtube.com/vi/UMr6_ODZsFg/0.jpg" /><br />Tap to play video</a>
<a src="http://www.youtube.com/embed/u8qzrBcont8"><img src="http://img.youtube.com/vi/u8qzrBcont8/0.jpg" /><br />Tap to play video</a>
<a src="http://www.youtube.com/embed/0283IhwTWd4"><img src="http://img.youtube.com/vi/0283IhwTWd4/0.jpg" /><br />Tap to play video</a>
<a src="http://www.youtube.com/embed/HOgnsaixbwE"><img src="http://img.youtube.com/vi/HOgnsaixbwE/0.jpg" /><br />Tap to play video</a>
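For anyone who wants to tidy this up further, here is a minimal sketch of the ID extraction pulled out into its own helper, so that each src value is handled on its own. It uses only the standard library and the same regex as above. The class name YoutubeThumbnails and the methods extractId and thumbnailUrls are hypothetical names made up for this illustration; they are not part of my parser or of Jsoup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the original parser.
final class YoutubeThumbnails {
    // Same pattern as in the post, compiled once instead of once per iframe.
    private static final Pattern VIDEO_ID =
            Pattern.compile("(?<=watch\\?v=|/embed/)[^&#]*");

    // Returns the video ID found in a single src value,
    // or an empty Optional if the pattern finds no non-empty ID.
    static Optional<String> extractId(String src) {
        Matcher matcher = VIDEO_ID.matcher(src);
        if (matcher.find() && !matcher.group().isEmpty()) {
            return Optional.of(matcher.group());
        }
        return Optional.empty();
    }

    // Builds one thumbnail URL per src value, skipping values without an ID.
    static List<String> thumbnailUrls(List<String> srcValues) {
        List<String> urls = new ArrayList<>();
        for (String src : srcValues) {
            extractId(src).ifPresent(
                    id -> urls.add("http://img.youtube.com/vi/" + id + "/0.jpg"));
        }
        return urls;
    }

    public static void main(String[] args) {
        List<String> srcValues = Arrays.asList(
                "http://www.youtube.com/embed/qxur7H_CtM0",
                "http://www.youtube.com/embed/nQl1Y5suqP4");
        // Prints one thumbnail URL per input, each with its own ID.
        thumbnailUrls(srcValues).forEach(System.out::println);
    }
}
```

In the Jsoup loop, something like extractId(e.attr("src")) could then replace the inline Pattern and Matcher code for each element e. Note that the pattern stops only at & or #, so an embed URL with a ?query suffix would need a stricter pattern.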