JSoup: Difficulty extracting a single element


For my college coding project, I am tasked with grabbing the live value of bitcoin from the internet and incorporating it into a mini "bitcoin program." The issue is that I am having difficulty extracting the value of bitcoin from certain websites. Any and all help would be greatly appreciated.

I have tried using different websites, with mixed results.

Example 1

    final String url = "https://www.coindesk.com/price/bitcoin";
    try
    {
        Document doc = Jsoup.connect(url).get();
        Element ele = doc.select("span.currency-price").first();
        final String words = ele.text();
        System.out.println(words);
    }
    catch(Exception ex)
    {
        ex.printStackTrace();
    }

Example 2

    final String url = "https://cointelegraph.com/bitcoin-price-index";
    try
    {
        Document doc = Jsoup.connect(url).get();
        Element ele = doc.select("div.price-value").first();
        final String words = ele.text();
        System.out.println(words);
    }
    catch(Exception ex)
    {
        ex.printStackTrace();
    }

Example 1 threw a java.lang.NullPointerException at com.mycompany.test.Test.main(Test.java:28).

Example 2 ran without error.


Solution

  • The site https://www.coindesk.com/price/bitcoin relies heavily on JavaScript to render its content. Jsoup cannot execute JavaScript; it only parses the raw HTML returned by the server. If span.currency-price is absent from that raw HTML, select(...).first() returns null, and calling ele.text() on it throws the NullPointerException.
    To see what Jsoup sees, visit the page with JavaScript disabled: the main content is missing. Alternatively, open the page and press Ctrl+U to view the page source before JavaScript modifies it.
    In Chrome's developer tools (Network tab) you can see the page make additional AJAX requests to get the current exchange rates as JSON from this URL: https://production.api.coindesk.com/v1/exchangeRates
    JavaScript then builds the HTML elements dynamically from this data. The page also requests a few other URLs to fetch graph data.
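
One way to act on this is to skip the HTML page and request the JSON endpoint directly. The following is only a minimal sketch, assuming Java 11+ (for java.net.http) and assuming the URL observed above still responds. The class and method names (ExchangeRateFetcher, buildRequest, fetchBody) are made up for illustration. The layout of the returned JSON is not described here, so the sketch prints the raw body rather than guessing at field names.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper class; the names are illustrative only.
class ExchangeRateFetcher {

    // URL taken from the Network tab observation above; it may have changed or been removed.
    static final String RATES_URL = "https://production.api.coindesk.com/v1/exchangeRates";

    // Builds a plain GET request that asks the server for JSON.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Accept", "application/json")
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
    }

    // Sends the request and returns the raw response body.
    // Throws IOException for any status other than 200.
    static String fetchBody(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) {
        try {
            // Print the raw JSON; inspect it before deciding how to parse it.
            System.out.println(fetchBody(RATES_URL));
        } catch (IOException | InterruptedException ex) {
            ex.printStackTrace();
        }
    }
}
```

If you would rather stay with Jsoup, it can also fetch non-HTML bodies: call ignoreContentType(true) on the connection, then execute() and read body() from the response. Either way, a JSON library is still needed to pull the bitcoin rate out of the returned text.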