
Download a file in Java from a URL that 1) has no visible extension (e.g. .jpg) or 2) redirects to a file


I know how to download a file from a URL such as:

http://i12.photobucket.com/albums/a206/zxc6/1_zps3e6rjofn.jpg


But when it comes to a URL like the one below:

https://images.duckduckgo.com/iu/?u=http%3......

I have no idea how to download the file.


The code I use to download files with IOUtils works well when the extension is visible, but for the example above it returns:

java.io.IOException: Server returned HTTP response code: 500 for URL: https://images.duckduckgo.com/iu/?u=http%3A%2F%2Fimages2.fanpop.com%2Fimage%2Fphotos%2F8900000%2FFirefox-firefox-8967915-1600-1200.jpg&f=1

The same error occurs even if you remove the &f=1.


Code for Downloader (a prototype, for testing purposes only):

import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;

import org.apache.commons.io.IOUtils;

public class Downloader {

    private static class ProgressListener implements ActionListener {

        @Override
        public void actionPerformed(ActionEvent e) {
            // e.getSource() is the DownloadProgressListener instance,
            // because it passes itself as the event source in the
            // overridden method afterWrite().
            System.out.println("Downloaded bytes : " + ((DownloadProgressListener) e.getSource()).getByteCount());
        }
    }

    /**
     * Main Method
     * 
     * @param args
     */
    public static void main(String[] args) {
        OutputStream os = null;
        InputStream is = null;
        ProgressListener progressListener = new ProgressListener();
        try {
            File fl = new File(System.getProperty("user.home").replace("\\", "/") + "/Desktop/image.jpg");
            URL dl = new URL(
                "https://images.duckduckgo.com/iu/?u=http%3A%2F%2Fimages2.fanpop.com%2Fimage%2Fphotos%2F8900000%2FFirefox-firefox-8967915-1600-1200.jpg&f=1");

            // http://i12.photobucket.com/albums/a206/zxc6/1_zps3e6rjofn.jpg

            // Open a single connection and reuse it for the headers and the body
            // (calling openStream() as well would open a second connection).
            URLConnection connection = dl.openConnection();

            // Content-Length arrives as a String; convert it to a number
            // if you want to calculate the percentage downloaded.
            System.out.println("Content Length:" + connection.getHeaderField("Content-Length"));
            System.out.println("Content Type:" + connection.getContentType());

            System.out.println("\n");

            is = connection.getInputStream();
            os = new FileOutputStream(fl);

            DownloadProgressListener dcount = new DownloadProgressListener(os);
            dcount.setListener(progressListener);

            // begin transfer by writing to dcount, not os.
            IOUtils.copy(is, dcount);

        } catch (Exception e) {
            System.out.println(e);
        } finally {
            IOUtils.closeQuietly(os);
            IOUtils.closeQuietly(is);
        }
    }
}

Code for DownloadProgressListener:

import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.commons.io.output.CountingOutputStream;

public class DownloadProgressListener extends CountingOutputStream {

    private ActionListener listener = null;

    public DownloadProgressListener(OutputStream out) {
        super(out);
    }

    public void setListener(ActionListener listener) {
        this.listener = listener;
    }

    @Override
    protected void afterWrite(int n) throws IOException {
        super.afterWrite(n);
        if (listener != null) {
            listener.actionPerformed(new ActionEvent(this, 0, null));
        }
    }

}

Questions I read before posting:

1) Download file from url that doesn't end with .extension

2) http://www.mkyong.com/java/how-to-get-url-content-in-java/

3) Download file using java apache commons?

4) How to download and save a file from Internet using Java?

5) How to create file object from URL object


Solution

  • As pointed out in the comments, the extension is irrelevant.
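If you still want a sensible file name, one option is to derive the extension from the response's Content-Type header (the value `connection.getContentType()` returns in the question's code) rather than from the URL. The sketch below is only an illustration: the class `ContentTypeExtension`, the method `extensionFor`, and the small lookup table are hypothetical and not part of the original post, and `Map.of` assumes Java 9 or later.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not from the original post): choose a file
// extension from a Content-Type header value instead of from the URL.
class ContentTypeExtension {

    // Deliberately small table of common image types; extend as needed.
    private static final Map<String, String> EXTENSIONS = Map.of(
            "image/jpeg", ".jpg",
            "image/png", ".png",
            "image/gif", ".gif",
            "image/webp", ".webp");

    // Returns the extension for a known type, or "" when the type is
    // missing or not in the table.
    static String extensionFor(String contentType) {
        if (contentType == null) {
            return "";
        }
        // Drop parameters such as "; charset=UTF-8" and normalise the case.
        String mime = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return EXTENSIONS.getOrDefault(mime, "");
    }

    public static void main(String[] args) {
        System.out.println(extensionFor("image/jpeg"));
        System.out.println(extensionFor("IMAGE/PNG; charset=binary"));
    }
}
```

With something like this, the target file could be named `"image" + extensionFor(connection.getContentType())` instead of hard-coding `image.jpg`.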

    The issue here is attempting to download something that's probably a redirect, or maybe just the parameters of an async call.

    Your extra-long URL without an extension is broken, but I can suggest a potential solution for the other type.

    If you observe the URL:

    https://images.duckduckgo.com/iu/?u=http%3A%2F%2Fimages2.fanpop.com%2Fimage%2Fphotos%2F8900000%2FFirefox-firefox-8967915-1600-1200.jpg&f=1

    the URL to the image is actually there. It's just encoded and should be pretty easy to decode. There are decoding libraries included in Java (java.net.URLDecoder), but should you wish to do it yourself, you can look at it this way:

    http%3A%2F%2Fimages2.fanpop.com%2Fimage%2Fphotos%2F8900000%2FFirefox-firefox-8967915-1600-1200.jpg&f=1

    The encoded portions have the form %XX, where XX is two hexadecimal digits giving the byte value of the character. Looking at a URL encoding table, you'll see %3A is a colon and %2F is a forward slash.

    If you replace all the encoded entities, you'll end up with: http://images2.fanpop.com/image/photos/8900000/Firefox-firefox-8967915-1600-1200.jpg&f=1
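For the do-it-yourself route, a minimal sketch of percent-decoding might look like the following. The class `PercentDecoder` and its `decode` method are hypothetical names chosen for illustration. It assumes the encoded bytes are UTF-8, and, unlike `java.net.URLDecoder`, it leaves `+` untouched instead of turning it into a space.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical hand-rolled decoder, for illustration only.
class PercentDecoder {

    // Replaces every well-formed %XX sequence with the byte it encodes,
    // then interprets the resulting bytes as UTF-8.
    static String decode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            if (s.charAt(i) == '%' && i + 2 < s.length()) {
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo); // e.g. %2F -> 0x2F -> '/'
                    i += 3;
                    continue;
                }
            }
            // Not a valid escape: copy the character through unchanged.
            int cp = s.codePointAt(i);
            byte[] raw = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            out.write(raw, 0, raw.length);
            i += Character.charCount(cp);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decode("http%3A%2F%2Fimages2.fanpop.com%2Fimage%2Fphotos"));
    }
}
```

In practice `URLDecoder.decode` is the safer choice; the loop above is mainly useful for seeing what the decoding step actually does.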

    In this case, you don't want the extra parameters, so you can discard the &f=1 and download the image from the original URL. In most cases, I imagine you can keep the extra parameter and it'll just be ignored.

    --

    In a nutshell:

    1. Extract the original URL
    2. Decode it
    3. Download
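The three steps above could be sketched roughly as follows. This assumes the original URL always sits in a query parameter named `u`, as in the DuckDuckGo example. The class `ProxiedImageDownloader` and its methods are hypothetical, and `URLDecoder.decode(String, Charset)` requires Java 10 or later. Splitting the query on `&` before decoding also drops the `&f=1` parameter automatically.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of "extract, decode, download"; not a general solution.
class ProxiedImageDownloader {

    // Steps 1 and 2: find the "u" query parameter and decode its value.
    static String extractTarget(String proxyUrl) {
        String query = URI.create(proxyUrl).getRawQuery();
        if (query != null) {
            for (String pair : query.split("&")) {
                int eq = pair.indexOf('=');
                if (eq > 0 && pair.substring(0, eq).equals("u")) {
                    return URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
                }
            }
        }
        throw new IllegalArgumentException("No 'u' parameter in " + proxyUrl);
    }

    // Step 3: stream the response body straight into a file.
    static void download(String url, Path target) throws IOException {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        String proxy = "https://images.duckduckgo.com/iu/?u=http%3A%2F%2Fimages2.fanpop.com"
                + "%2Fimage%2Fphotos%2F8900000%2FFirefox-firefox-8967915-1600-1200.jpg&f=1";
        String original = extractTarget(proxy);
        System.out.println(original);
        // Only hit the network when an output path is given on the command line.
        if (args.length > 0) {
            download(original, Paths.get(args[0]));
        }
    }
}
```

Whether the decoded URL still serves the image depends on the remote host, so treat the download step as untested against the live site.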

    I'd like to point out this is a fragile solution and will break if the URL pattern changes, or it would require a lot of maintenance. If you're targeting more than a small group of users, you should re-think your approach.

    HTML URL encoding table