
JSoup not reading content from URL with anchor


I'm using JSoup to read content from the following page:

https://www.astrology.com/horoscope/daily/aries.html#Monday

This is the code that I'm using:

String test1 = "https://www.astrology.com/horoscope/daily/aries.html#Monday";
String test2 = "https://www.astrology.com/horoscope/daily/aries.html#Tuesday";

Document document = Jsoup.connect(test1).get();
Element content = document.getElementById("content");
Element p = content.child(0);
String myTest = p.text();

I can pass the day in the URL as an anchor (see the test1 and test2 variables), but both cases return the same content. It looks like JSoup is simply ignoring the anchor and using only the base URL: https://www.astrology.com/horoscope/daily/aries.html. Is there a way for JSoup to read a URL with an anchor?


Solution

  • Jsoup appears to ignore the anchor because the fragment (the part after #) is never sent to the server; it is handled in the browser, where the page's JavaScript uses it to render the relevant day. Jsoup does not execute JavaScript, so it only sees the base page. If you examine the page with your browser's dev tools, you'll see that the daily info comes from a JSON resource such as https://www.astrology.com/horoscope/daily/all/aries/2021-03-23/, so you can change the date and sign in that URL and request whatever you need.
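
As a rough sketch of that approach, the snippet below builds the JSON URL from a sign and a date and downloads the raw response body. The class and method names (HoroscopeFetcher, withoutFragment, buildDailyUrl, fetchDailyJson) are hypothetical, and the URL pattern is inferred only from the single example above, so it may not hold for every sign or date. It uses the JDK's built-in HttpClient (Java 11+) to stay dependency-free; with Jsoup you could likely get the same body via Jsoup.connect(url).ignoreContentType(true).execute().body().

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class HoroscopeFetcher {
    // Assumption: base path copied from the example URL in the answer.
    private static final String BASE =
            "https://www.astrology.com/horoscope/daily/all/";

    // Returns the URL without its fragment ("#Monday"), i.e. the part
    // that an HTTP client actually requests from the server.
    static String withoutFragment(String url) {
        int hash = url.indexOf('#');
        return hash < 0 ? url : url.substring(0, hash);
    }

    // Builds the JSON URL for a sign and date, e.g. .../aries/2021-03-23/
    static String buildDailyUrl(String sign, LocalDate date) {
        return BASE
                + sign.toLowerCase(Locale.ROOT) + "/"
                + date.format(DateTimeFormatter.ISO_LOCAL_DATE) + "/";
    }

    // Downloads the raw JSON text; parsing is left to a JSON library
    // because the response structure is not described in the answer.
    static String fetchDailyJson(String sign, LocalDate date)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest
                .newBuilder(URI.create(buildDailyUrl(sign, date)))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status: " + response.statusCode());
        }
        return response.body();
    }
}
```

Calling HoroscopeFetcher.fetchDailyJson("aries", LocalDate.of(2021, 3, 23)) would request the URL shown above. The withoutFragment helper only illustrates why test1 and test2 produce the same request: both reduce to the same base URL once the fragment is dropped.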