Tags: java, c#, .net, web-scraping, jsoup

Unable to get results from jsoup when sending a POST request


This is the code snippet. It always returns an error page:

    try {
        String url = "http://kepler.sos.ca.gov/";
        Connection.Response response = Jsoup.connect(url)
                .method(Connection.Method.GET)
                .execute();

        Document responseDocument = response.parse();

        Element eventValidation = responseDocument.select("input[name=__EVENTVALIDATION]").first();
        Element viewState = responseDocument.select("input[name=__VIEWSTATE]").first();
        response = Jsoup.connect(url)
                .data("__VIEWSTATE", viewState.attr("value"))
                .data("__EVENTVALIDATION", eventValidation.attr("value"))
                .data("ctl00_content_placeholder_body_BusinessSearch1_TextBox_NameSearch", "escrow")  // <- search 
                .data("ctl00_content_placeholder_body_BusinessSearch1_RadioButtonList_SearchType", "Corporation Name")
                .data("ctl00_content_placeholder_body_BusinessSearch1_Button_Search", "Search")

                .method(Connection.Method.POST)
                .followRedirects(true)
                .execute();
        Document document = response.parse(); //search results
        System.out.println(document);

    } catch (IOException e) {
        e.printStackTrace();
    }

I copied the request parameters from the Net panel of Firebug and sent the same values. Am I missing something?


Solution

  • Depending on your Android version, that code will throw a "NetworkOnMainThreadException" if you run it directly from a button click or similar UI callback. On Honeycomb or later, you have to do network access from a separate thread or an AsyncTask.

    From my debugging, you need to send the cookies returned by the initial GET; the code below does that. Also, your form field names used underscores where the server expects dollar signs, and the browser sends several empty form fields that the server may still expect, so I included those too.

    For future reference, I recommend the tool Fiddler to debug issues like this if you're not using it already.

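    import android.os.AsyncTask;
    import org.jsoup.Connection;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import java.io.IOException;
    import java.util.Map;

    class DownloadFilesTask extends AsyncTask<Void, Integer, Long> {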
        protected Long doInBackground(Void... params) {
            long totalSize = 0;
    
            try {
                String url = "http://kepler.sos.ca.gov/";
                Connection.Response response = Jsoup.connect(url)
                        .method(Connection.Method.GET)
                        .execute();
    
                Document responseDocument = response.parse();
                Map<String, String> loginCookies = response.cookies();
    
    
                Element eventValidation = responseDocument.select("input[name=__EVENTVALIDATION]").first();
                String validationKey = eventValidation.attr("value");
    
                Element viewState = responseDocument.select("input[name=__VIEWSTATE]").first();
                String viewStateKey = viewState.attr("value");
    
                response = Jsoup.connect(url)
                        .cookies(loginCookies)
                        .data("__EVENTTARGET", "")
                        .data("__EVENTARGUMENT", "")
                        .data("__LASTFOCUS", "")
                        .data("__VIEWSTATE", viewStateKey)
                        .data("__VIEWSTATEENCRYPTED", "")
                        .data("__EVENTVALIDATION", validationKey)
                        .data("ctl00$content_placeholder_body$BusinessSearch1$TextBox_NameSearch", "aaa")  // <- search
                        .data("ctl00$content_placeholder_body$BusinessSearch1$RadioButtonList_SearchType", "Corporation Name")
                        .data("ctl00$content_placeholder_body$BusinessSearch1$Button_Search", "Search")
    
                        .method(Connection.Method.POST)
                        .followRedirects(true)
                        .execute();
                Document document = response.parse(); //search results
                System.out.println(document);
    
            } catch (IOException e) {
                e.printStackTrace();
            }
    
            return totalSize;
        }
    
        protected void onProgressUpdate(Integer... progress) {
        }
    
        protected void onPostExecute(Long result) {
        }
    }
    

    To run the task, create an instance and call execute():

    DownloadFilesTask t = new DownloadFilesTask();
    t.execute();
    

    To get page 2, add the following form fields to the POST (they are form data, not HTTP headers). This is pseudocode; you'd have to convert each line to a .data call:

    __EVENTTARGET = ctl00$content_placeholder_body$SearchResults1$GridView_SearchResults_Corp
    __EVENTARGUMENT = Page$2
    

    You still need the other form fields (__VIEWSTATEENCRYPTED blank, __VIEWSTATE as above) and the cookies as above.
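
The page 2 pseudocode above could be turned into form fields roughly as follows. This is a minimal sketch, not tested against the live site: the class name PagingFields and the method forPage are hypothetical, and including __EVENTVALIDATION in the paging postback is an assumption based on the first POST. The view state and event validation values would most likely need to be re-read from the search results page rather than reused from the initial GET.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the form fields for a grid paging postback.
class PagingFields {
    // Control name quoted from the answer above.
    static final String GRID =
            "ctl00$content_placeholder_body$SearchResults1$GridView_SearchResults_Corp";

    // Returns the fields that ask the results grid for the given page number.
    static Map<String, String> forPage(int page, String viewState, String eventValidation) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("__EVENTTARGET", GRID);
        fields.put("__EVENTARGUMENT", "Page$" + page);
        fields.put("__VIEWSTATE", viewState);
        fields.put("__VIEWSTATEENCRYPTED", "");
        // Assumption: the server validates this field on paging postbacks too.
        fields.put("__EVENTVALIDATION", eventValidation);
        return fields;
    }

    public static void main(String[] args) {
        // Placeholder values; real ones come from the hidden inputs of the results page.
        Map<String, String> fields = forPage(2, "<viewstate>", "<eventvalidation>");
        fields.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

With jsoup, the resulting map can be passed in one call, for example Jsoup.connect(url).cookies(loginCookies).data(fields).method(Connection.Method.POST).execute(), since Connection.data accepts a Map<String, String>.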