I want to create a small application that downloads and installs/upgrades all my Windows software.
But more and more download pages rely on annoying JavaScript.
I tried PhantomJS, but it can't download files.
I just tried HtmlUnit, and it works very well for either downloading a file or getting its original filename.
However, I can't get it to do both at the same time. My code doesn't work:
package com.example.simpledownloader;

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.logging.Level;
import org.apache.commons.io.FilenameUtils;

public class Main {

    public static void main(String[] args) throws Exception {
        testDownload();
    }

    public static void testDownload() throws IOException {
        // Turn htmlunit warnings off.
        java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(Level.OFF);

        // Init web client and navigate to the first page.
        final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_31);
        final HtmlPage page1 = webClient.getPage("http://www.videohelp.com/software/AV-Splitter");

        // Get the anchor element.
        String xpath1 = "//*[@id=\"Main\"]/div/div/div[11]/table[1]/tbody/tr[3]/td[2]/a[6]";
        HtmlElement element = (HtmlElement) page1.getByXPath(xpath1).get(0);

        // Extract the original filename from the filepath.
        String filepath = element.click().getUrl().getFile();
        String filename = FilenameUtils.getName(filepath);
        System.out.println(filename);

        // Download the file.
        InputStream inputStream = element.click().getWebResponse().getContentAsStream();
        FileOutputStream outputStream = new FileOutputStream(filename);
        int read;
        byte[] bytes = new byte[1024];
        while ((read = inputStream.read(bytes)) != -1) {
            outputStream.write(bytes, 0, read);
        }

        // Close the webclient.
        webClient.close();
    }
}
Getting the filename works, but the download doesn't.
I get this error:
Exception in thread "main" java.lang.RuntimeException: java.io.FileNotFoundException: C:\Users\Admin\AppData\Local\Temp\htmlunit46883917986334906.tmp (The system cannot find the file specified)
Is it maybe because I already clicked once to get the filename?
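Actually, you are clicking twice: once for the filename and once for the content.
Click once, keep the returned Page, and use it for both. How about: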
// Requires: import com.gargoylesoftware.htmlunit.Page;
// Click once and keep the resulting page.
Page page2 = element.click();
// Extract the original filename from the filepath.
String filepath = page2.getUrl().getFile();
String filename = FilenameUtils.getName(filepath);
System.out.println(filename);
// Download the file from the same response.
// try-with-resources closes both streams, even if the copy fails.
try (InputStream inputStream = page2.getWebResponse().getContentAsStream();
     FileOutputStream outputStream = new FileOutputStream(filename)) {
    int read;
    byte[] bytes = new byte[1024];
    while ((read = inputStream.read(bytes)) != -1) {
        outputStream.write(bytes, 0, read);
    }
}
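The filename extraction and the copy loop can also be written with the JDK alone. This is only a minimal sketch, not part of HtmlUnit: the class name `DownloadHelper`, both helper methods, and the example URL are made up for illustration. Note that `URL.getFile()` includes the query string when there is one, so this sketch uses `getPath()` instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, for illustration only.
public class DownloadHelper {

    /** Returns the last path segment of the URL, ignoring any query string. */
    public static String fileNameFromUrl(URL url) {
        String path = url.getPath(); // path only, without "?query"
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Copies the stream to the target file, closes the stream, returns the byte count. */
    public static long saveStream(InputStream in, Path target) throws IOException {
        try (InputStream source = in) {
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up URL and payload standing in for a real download response.
        URL url = URI.create("http://example.com/files/setup-1.0.exe?mirror=2").toURL();
        String name = fileNameFromUrl(url);
        Path target = Files.createTempDirectory("download-demo").resolve(name);
        byte[] payload = "fake installer bytes".getBytes(StandardCharsets.UTF_8);
        long written = saveStream(new ByteArrayInputStream(payload), target);
        System.out.println(name + ": " + written + " bytes written to " + target);
    }
}
```

With the code above, this would presumably be called as `saveStream(page2.getWebResponse().getContentAsStream(), Paths.get(filename))`, still after a single `click()`.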