
How to download a ZIP file with HTMLUnit clicking on an anchor


I'm trying to download a ZIP file with HTMLUnit 2.32 using the following code.

The resulting "myfile.zip" is corrupt and larger than the one downloaded through a normal browser (179 KB vs 79 KB).

How should one click an anchor and download a file with HTMLUnit?

        WebClient wc = new WebClient(BrowserVersion.CHROME);

        final String HREF_SCARICA_CONSOLIDATI = "/web/area-pubblica/quotate?viewId=export_quotate";

        final String CONSOBBase = "http://www.consob.it";

        HtmlPage page = wc.getPage(CONSOBBase + HREF_SCARICA_CONSOLIDATI);

        final String downloadButtonXpath = "//a[contains(@href, 'javascript:downloadAzionariato()')]";
        List<HtmlAnchor> downloadAnchors = page.getByXPath(downloadButtonXpath);
        HtmlAnchor downloadAnchor = downloadAnchors.get(0);

        UnexpectedPage downloadedFile = downloadAnchor.click();

        InputStream contentAsStream = downloadedFile.getWebResponse().getContentAsStream();
        File destFile = new File("/tmp", "myfile.zip");
        Writer out = new OutputStreamWriter(new FileOutputStream(destFile));
        IOUtils.copy(contentAsStream, out);
        out.close();

Solution

  • While RBRi's considerations are interesting, I found that my code works with HTMLUnit 2.32 without modification; the real problem was that I was writing the file the wrong way.

    I used

    Writer out = new OutputStreamWriter(new FileOutputStream(destFile));
    IOUtils.copy(contentAsStream, out);
    

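    while it should have been (no OutputStreamWriter, because a Writer converts the bytes to characters and back, which corrupts binary data such as a ZIP file)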

    FileOutputStream out = new FileOutputStream(destFile);
    IOUtils.copy(contentAsStream, out);
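
The difference between the two variants can be reproduced without HTMLUnit. The following is only a minimal sketch using the JDK alone: the class name `BinaryCopyDemo`, its two helper methods, and the sample bytes are hypothetical, and `Files.copy` stands in for `IOUtils.copy(InputStream, OutputStream)`. It fixes the charset to UTF-8 for reproducibility, whereas the original code used the platform default.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

// Hypothetical demo class, not part of HTMLUnit or Commons IO.
class BinaryCopyDemo {

    // Mimics the faulty approach: bytes are decoded to chars and re-encoded.
    // Byte sequences that are not valid UTF-8 are replaced during decoding,
    // so the output no longer matches the input.
    static byte[] copyThroughWriter(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Reader in = new InputStreamReader(
                     new ByteArrayInputStream(data), StandardCharsets.UTF_8);
             Writer out = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return sink.toByteArray();
    }

    // Byte-for-byte copy to a temporary file, then read back for comparison.
    static byte[] copyAsBytes(byte[] data) throws IOException {
        Path tmp = Files.createTempFile("demo", ".zip");
        try (InputStream in = new ByteArrayInputStream(data)) {
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return Files.readAllBytes(tmp);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        // ZIP signature followed by a few bytes that are invalid as UTF-8.
        byte[] data = {'P', 'K', 3, 4, (byte) 0xFF, (byte) 0xFE, (byte) 0x80, (byte) 0x9C};

        byte[] viaWriter = copyThroughWriter(data);
        byte[] viaBytes = copyAsBytes(data);

        System.out.println("original length:    " + data.length);
        System.out.println("via Writer length:  " + viaWriter.length
                + ", identical: " + Arrays.equals(data, viaWriter));
        System.out.println("via byte copy:      " + viaBytes.length
                + ", identical: " + Arrays.equals(data, viaBytes));
    }
}
```

With real code, wrapping the `FileOutputStream` in a try-with-resources block in the same way ensures the stream is closed even if the copy throws.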