Java jsoup - all links belonging to specific classes - no output, no error message


I'm learning jsoup in Java.

I'd like to print all links that have a specific set of classes.

I'm not getting any output, although I expect: my beutiful link : /sport/golf/66227692

I've looked at "Jsoup: How get all the href associated with a specific class", "Jsoup - get text from all elements with a particular class under a Specified class", and https://jsoup.org/cookbook/extracting-data/selector-syntax, but I don't know where I'm going wrong.

Thanks for any help.

import java.io.IOException;
import org.jsoup.Jsoup;  
import org.jsoup.nodes.Document;  
import org.jsoup.nodes.Element;  
import org.jsoup.select.Elements;


public class Anyclass {

    public static void main(String[] args) throws IOException {
    
        String html = "<a href=\"/sport/golf/66227692\" class=\"ssrcss-6m4230-PromoLink e1f5wbog1\"><span role=\"text\"><p class=\"ssrcss-6arcww-PromoHeadline e1f5wbog6\"><span aria-hidden=\"false\">McIlroy's 'perfect preparation' for Hoylake Open</span></p></span></a>";
        Document document = Jsoup.parse(html);
        Elements links = document.select("a[href].ssrcss-6m4230-PromoLink e1f5wbog1"); 
        for (Element link : links) {
            System.out.println("my beutiful link : " + link.attr("href")); 
        }
    }
}


// OUTPUT:


// EXPECTED OUTPUT:
   my beutiful link : /sport/golf/66227692

Solution

  • You are missing the second . (dot) in the class selector. This should work:

     Elements links = document.select("a[href].ssrcss-6m4230-PromoLink.e1f5wbog1");
    

    which means: select an a element that has an href attribute, the class ssrcss-6m4230-PromoLink, and the class e1f5wbog1.

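    What you wrote, a[href].ssrcss-6m4230-PromoLink e1f5wbog1, means: select an a element that has an href attribute and the class ssrcss-6m4230-PromoLink, then select all of its descendants (at any depth) whose tag name is e1f5wbog1. The space acts as a descendant combinator, and since no such tag exists in the HTML, nothing matches.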

    Makes a difference, no? ;)
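
To see the two readings of the selector side by side, here is a minimal sketch that runs both queries against a shortened version of the markup from the question. The class name SelectorDemo and the helper hrefs are hypothetical names introduced for illustration, and the code assumes jsoup is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical demo class (not from the question); requires the jsoup library.
class SelectorDemo {

    // Shortened version of the markup from the question.
    static final String HTML =
        "<a href=\"/sport/golf/66227692\" class=\"ssrcss-6m4230-PromoLink e1f5wbog1\">"
        + "<span role=\"text\">McIlroy's 'perfect preparation' for Hoylake Open</span></a>";

    // Returns the href attribute of every element matched by the CSS query.
    static List<String> hrefs(String html, String cssQuery) {
        Document document = Jsoup.parse(html);
        List<String> result = new ArrayList<>();
        for (Element link : document.select(cssQuery)) {
            result.add(link.attr("href"));
        }
        return result;
    }

    public static void main(String[] args) {
        // Space = descendant combinator: looks for an <e1f5wbog1> tag inside the link.
        System.out.println(hrefs(HTML, "a[href].ssrcss-6m4230-PromoLink e1f5wbog1"));
        // Chained dots = a single element that carries both classes.
        System.out.println(hrefs(HTML, "a[href].ssrcss-6m4230-PromoLink.e1f5wbog1"));
    }
}
```

With the markup above, the first query should print an empty list and the second should print the single href, which matches the behaviour described in the question.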