Jsoup meta refresh redirect

I want to fetch an HTML page that sits behind a meta refresh redirect, very similar to the question "Can Jsoup handle meta refresh redirect?".

But I can't get it to work. I want to run a search on the Synchronkartei site, and I have the following code:

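import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class SynchronkarteiScraper {
  public static void main(String[] args) throws Exception {
    // Send the search request with the same parameters as the web form
    Document doc = Jsoup.connect("")
        .data("cat", "2")
        .data("search", "Thomas Danneberg")
        .data("action", "search")
        .get();
    // Look for a <meta http-equiv="refresh"> tag on the landing page
    Elements meta = doc.select("html head meta");
    for (final Element m : meta) {
      if (m.attr("http-equiv").contains("refresh")) {
        // Request the refresh target with a second connect
        doc = Jsoup.connect(m.baseUri() + m.attr("content").split("=")[1]).get();
      }
    }
    // Print the body of whatever page was fetched last
    System.out.println(doc.body());
  }
}
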

This runs the search, which leads to a temporary page that refreshes and then opens the real result page. It is the same as going to the site, selecting "Sprecher" from the drop-down box, entering "Thomas Danneberg" in the text field, and hitting Enter.

But even after extracting the refresh URL and doing a second connect, I still get the content of the temporary landing page, which can be seen in the println of the body.

So what is going wrong here?

As a note, the site always redirects to HTTPS. Since it uses a certificate from StartCom, Java complains about the certificate path. To make the above snippet work, it is necessary to start the JVM with a VM parameter pointing at <path-to-keystore>, a keystore that contains the correct certificate.
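
A side note on the URL-extraction step in the snippet above: `split("=")[1]` keeps only the text between the first and second `=`, so a refresh target that carries its own query string is cut short, and appending the result to `baseUri()` does not resolve a relative path. Whether that matters for this site is not confirmed. The following is only a minimal sketch of a more careful extraction using the JDK alone; the class name `MetaRefresh`, the method `target`, and the `example.org` URLs are hypothetical.

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetaRefresh {
    // Matches content values such as "0; URL=/next?x=1" or "5;url='next.html'".
    private static final Pattern CONTENT = Pattern.compile(
            "\\s*\\d+\\s*;\\s*url\\s*=\\s*['\"]?([^'\"]+)['\"]?\\s*",
            Pattern.CASE_INSENSITIVE);

    /**
     * Extracts the refresh target from a meta refresh "content" value and
     * resolves it against the URI of the page that contained the tag.
     * Returns an empty Optional when the value names no target URL.
     */
    static Optional<URI> target(String baseUri, String content) {
        Matcher m = CONTENT.matcher(content);
        if (!m.matches()) {
            return Optional.empty();
        }
        String raw = m.group(1).trim();
        // URI.resolve handles absolute URLs, absolute paths and relative paths.
        return Optional.of(URI.create(baseUri).resolve(raw));
    }
}
```

With Jsoup, the two arguments would come from `m.baseUri()` and `m.attr("content")`, and the resolved URI's string form could be passed to `Jsoup.connect`.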


  • I have to admit that I am no expert in Jsoup, but I do know some details about the Synchronkartei.

    Deutsche Synchronkartei supports OpenSearch descriptions; the description document is linked at /search.xml. That said, you could also use the search URL template with the {searchTerms} placeholder to get your search term into the session.

    All you need is the cookie "sid" with the session ID that the Synchronkartei provides. After that, a direct request to the result page will return the results, regardless of your referrer.

    What I mean is: first send a request to the {searchTerms} URL, or to the search URL with cat={Category}&search={searchTerms}&action=search (as you did above). If it returns HTTP status 200, ignore the response body completely but save the session cookie. After that, send a request to the result page, which should then return the whole list of results.
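
The two-step flow described above can be sketched as follows. This is only an illustration under stated assumptions, not tested against the site: it uses the JDK's `java.net.http` client instead of Jsoup so that it has no dependencies, the class and method names are made up, and `searchUrl` and `resultUrl` are hypothetical parameters standing in for the site's search and result addresses. It also assumes the HTTPS address is passed directly, because the default `HttpClient` does not follow redirects.

```java
import java.io.IOException;
import java.net.HttpCookie;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.Optional;

public class SessionCookieFetch {

    /** Returns the value of the named cookie from raw Set-Cookie header values. */
    static Optional<String> cookieValue(List<String> setCookieHeaders, String name) {
        for (String header : setCookieHeaders) {
            for (HttpCookie cookie : HttpCookie.parse(header)) {
                if (cookie.getName().equals(name)) {
                    return Optional.of(cookie.getValue());
                }
            }
        }
        return Optional.empty();
    }

    /** Step 1: run the search and keep only "sid". Step 2: fetch the results with it. */
    static String fetchResults(String searchUrl, String resultUrl)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();

        // First request: the body (the temporary landing page) is ignored.
        HttpResponse<String> first = client.send(
                HttpRequest.newBuilder(URI.create(searchUrl)).GET().build(),
                HttpResponse.BodyHandlers.ofString());
        String sid = cookieValue(first.headers().allValues("set-cookie"), "sid")
                .orElseThrow(() -> new IOException("no sid cookie in response"));

        // Second request: send the session cookie along to the result page.
        HttpResponse<String> second = client.send(
                HttpRequest.newBuilder(URI.create(resultUrl))
                        .header("Cookie", "sid=" + sid)
                        .GET().build(),
                HttpResponse.BodyHandlers.ofString());
        return second.body();
    }
}
```

In Jsoup itself the same idea would presumably go through `Connection.execute()`, reading the cookie with `Connection.Response.cookie("sid")` and passing it to the second connection with `Connection.cookie("sid", value)`.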
