android, web-scraping, cookies, webview, jsoup

Jsoup cookie authentication from CookieSyncManager to scrape an HTTPS site


I have an Android application that uses a WebView in which the user has to log in with a username and password before being redirected to the page I would like to scrape with Jsoup. Since the Jsoup request runs in a different session, the user would have to log in again.

Instead, I would like to take the cookie received by the WebView and send it with the Jsoup request so that I can scrape my data.

The cookie is synced with CookieSyncManager using the following code. This is where I am stuck, because I don't know how to read the cookie or how to attach it to the Jsoup request. Please help? :)

public void onPageFinished(WebView view, String url) {
    CookieSyncManager.getInstance().sync();
    // ...
}

After the user has logged in, I run the Jsoup scrape with something like this:

doc = Jsoup.connect("https://need.authentication.com").get();

Elements elements = doc.select("span.tabCount");
Element count = elements.first();

Log.d(TAG, "test" + count);

Solution

  • I'm not an Android developer, but maybe you can try something like this:

    final String url = "https://need.authentication.com";

    // -- Android cookie part --
    CookieSyncManager.getInstance().sync();
    CookieManager cm = CookieManager.getInstance();

    // Returns all cookies for the URL as one "name=value; name2=value2" string
    // (may be null if nothing is stored for that URL, so check before using it)
    String cookie = cm.getCookie(url);

    // ...

    // -- Jsoup part --
    // Send the WebView cookies unchanged in the Cookie request header
    doc = Jsoup.connect(url).header("Cookie", cookie).get();

    // ...
    

    I hope this helps a bit, but as I said before: I'm not an Android developer (and the code isn't tested!)
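
If passing the raw header string gives trouble, an alternative is to hand Jsoup the cookies as separate name/value pairs. The string from `getCookie(url)` is in Cookie-header form (`name=value; name2=value2`), so it has to be split first. Below is a minimal sketch of that step in plain Java. `CookieHeaderParser` and `parse` are hypothetical names made up for this example, not part of Android or Jsoup, and the sketch assumes no cookie value contains a semicolon.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Android or Jsoup): splits a Cookie-header
// style string such as "a=1; b=2" into name/value pairs.
final class CookieHeaderParser {

    private CookieHeaderParser() {}

    static Map<String, String> parse(String cookieHeader) {
        // LinkedHashMap keeps the cookies in the order they appeared
        Map<String, String> cookies = new LinkedHashMap<>();
        if (cookieHeader == null || cookieHeader.trim().isEmpty()) {
            return cookies; // nothing stored for this URL
        }
        for (String pair : cookieHeader.split(";")) {
            // Split on the first '=' only, since values may contain '=' themselves
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip fragments without a name or without '='
            }
            String name = pair.substring(0, eq).trim();
            String value = pair.substring(eq + 1).trim();
            if (!name.isEmpty()) {
                cookies.put(name, value);
            }
        }
        return cookies;
    }
}
```

The resulting map can then be passed to Jsoup's `Connection.cookies(Map<String, String>)`, for example `Jsoup.connect(url).cookies(CookieHeaderParser.parse(cookie)).get()`. Because the helper returns an empty map for a null or blank string, the request still goes out when the WebView has not stored a cookie yet. It will simply be unauthenticated.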

    Here's some documentation: