
Java: parse data from an HTML table with jsoup


I want to get the data from the table at the link below.

Link:

https://www.nasdaq.com/symbol/aapl/financials?query=balance-sheet

I've tried the following code, but it doesn't work:

public static void main(String[] args) {
    try {
        Document doc = Jsoup.connect("https://www.nasdaq.com/symbol/aapl/financials?query=balance-sheet").get();
        Elements trs = doc.select("td_genTable");



        for (Element tr : trs) {
            Elements tds = tr.getElementsByTag("td");
            Element td = tds.first();
            System.out.println(td.text());
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

Can anybody help me get this to work?

I'm not getting any output from the table. Nothing happens.


Solution

  • After testing your code I got a read timeout error. Searching on Google, I found a post that suggests setting a user agent to fix it, and that worked for me. So you can try this:

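    import java.io.IOException;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public static void main(String[] args) {
        try {
            // Set a user agent; without it the request timed out for me
            Document doc = Jsoup.connect("https://www.nasdaq.com/symbol/aapl/financials?query=balance-sheet")
                    .userAgent("Mozilla/5.0").get();
            Elements trs = doc.select("tr");
            for (Element tr : trs) {
                // Class selector: note the leading dot
                Elements tds = tr.select(".td_genTable");
                // Skip header rows, which have no matching cells
                // (calling first() on them would lead to a NullPointerException)
                if (tds.isEmpty()) continue;
                // Take the first sibling of the first cell (see the HTML structure of the page)
                Element td = tds.first().siblingElements().first();
                // first() returns null if the cell has no siblings
                if (td != null) System.out.println(td.text());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }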
    

    I added the user agent option and fixed some errors in the selector queries (the class selector needs a leading dot: ".td_genTable"). This should be a useful starting point for your work ;)
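
To see why the original query printed nothing, here is a small offline sketch that compares the two selectors. It parses an inline string instead of fetching the page, so it does not depend on the network. The markup in SAMPLE_HTML, the class name SelectorDemo and the helper rowLabels are made up for illustration; the real page is larger and its structure may differ.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class SelectorDemo {

    // Hypothetical, simplified markup; not copied from the real page.
    static final String SAMPLE_HTML =
            "<table>"
          + "<tr><th>Period Ending:</th><th>Year 1</th><th>Year 2</th></tr>"
          + "<tr><th>Cash</th><td class=\"td_genTable\">100</td><td class=\"td_genTable\">200</td></tr>"
          + "<tr><th>Inventory</th><td class=\"td_genTable\">30</td><td class=\"td_genTable\">40</td></tr>"
          + "</table>";

    // For each row, returns the text of the first sibling of the first
    // .td_genTable cell (in this sample, the row label in the <th>).
    static List<String> rowLabels(String html) {
        Document doc = Jsoup.parse(html);
        List<String> labels = new ArrayList<>();
        for (Element tr : doc.select("tr")) {
            Elements tds = tr.select(".td_genTable");
            if (tds.isEmpty()) continue; // header row: no matching cells
            Element label = tds.first().siblingElements().first();
            if (label != null) labels.add(label.text());
        }
        return labels;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(SAMPLE_HTML);
        // No leading dot: jsoup looks for a <td_genTable> tag and finds none.
        System.out.println(doc.select("td_genTable").size());  // 0
        // Leading dot: matches elements whose class is td_genTable.
        System.out.println(doc.select(".td_genTable").size()); // 4
        System.out.println(rowLabels(SAMPLE_HTML));             // [Cash, Inventory]
    }
}
```

Because the first query matches no elements, the for loop in the question never runs, which would explain the empty output once the request itself succeeds.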