
Jsoup Not Parsing IFrame out of HTML


Can anyone explain why jsoup is not picking up the iframe in the following HTML?

<div class="video">
<script class="video_preview_source" type="text/html">
<iframe src="//player.vimeo.com/video/109fdsagfa" id="campaign_video_7566" width="353" height="240" frameborder="0"></iframe></script>
<div class="video_preview"></div>
</div>

I am using this code:

Document document = Jsoup.parse(html);

Elements elements = document.select("div.video script.video_preview_source iframe[src]");

System.out.println("elements:" + elements);

Solution

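  • jsoup is not picking up the <iframe> because the parser treats everything inside a <script> tag as raw text (data), not as markup, so the iframe never becomes an element in the document tree. Select the <script> element and call .data() on it to get its contents, then parse that string separately.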

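    Also note: you can't select attributes directly. A selector such as iframe[src] matches elements that have a src attribute; you always get whole elements back and read the value with .attr("src").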

    Splitting this into two parsing steps, the following code works for me:

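    Document document = Jsoup.parse(html);

    // Select the <script> element itself; its body is stored as data, not as child elements.
    Elements elements =
                document.select("div.video script.video_preview_source");

    // Re-parse the script's raw contents as HTML (assumes the selector matched at least one element).
    Document iframeDoc = Jsoup.parse(elements.get(0).data());

    Elements iframeElements = iframeDoc.select("iframe");

    // attr() returns the value from the first matched element that has the attribute.
    System.out.println(iframeElements.attr("src"));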
    

    Regards, Alexander.
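
For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the same two-step approach. It assumes jsoup 1.11.1 or later is on the classpath (for selectFirst). The class name ScriptIframeExample and the helper extractIframeSrc are made up for illustration, and the null checks are an addition that is not in the answer above.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ScriptIframeExample {

    // Hypothetical helper: returns the iframe's src value, or null if the
    // script element or the iframe inside it cannot be found.
    static String extractIframeSrc(String html) {
        Document document = Jsoup.parse(html);

        // selectFirst returns null when nothing matches, which avoids the
        // IndexOutOfBoundsException that get(0) throws on an empty result.
        Element script = document.selectFirst("div.video script.video_preview_source");
        if (script == null) {
            return null;
        }

        // The script body is raw text, so parse it as its own HTML fragment.
        Document inner = Jsoup.parseBodyFragment(script.data());
        Element iframe = inner.selectFirst("iframe[src]");
        return iframe == null ? null : iframe.attr("src");
    }

    public static void main(String[] args) {
        // Same structure as the markup in the question, with fewer attributes.
        String html = "<div class=\"video\">"
                + "<script class=\"video_preview_source\" type=\"text/html\">"
                + "<iframe src=\"//player.vimeo.com/video/109fdsagfa\" id=\"campaign_video_7566\"></iframe>"
                + "</script>"
                + "<div class=\"video_preview\"></div>"
                + "</div>";

        System.out.println(extractIframeSrc(html));
    }
}
```

Returning null for a missing script or iframe is only one possible choice; depending on the caller, an Optional<String> or an exception may fit better.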