I want to get the Content-Length of this file using Java:
https://www.subf2m.co/subtitles/farsi_persian-text/SImp4fRrRnBK6j-u2RiPdXSsHSuGVCDLz4XZQLh05FnYmw92n7DZP6KqbHhwp6gfvrxazMManmskHql6va6XEfasUDxGevFRmkWJLjCzsCK50w1lwNajPoMGPTy9ebCC0&name=Q2FwdGFpbiBNYXJ2ZWwgRmFyc2lQZXJzaWFuIGhlYXJpbmcgaW1wYWlyZWQgc3VidGl0bGUgLSBTdWJmMm0gW3N1YmYybS5jb10uemlw
When I open this URL in Firefox or Google Chrome, it downloads a file. But when I try to read that file's size with Java's HttpsURLConnection, the server returns response code 403 and content length -1. Why does this happen? Thanks.
try {
    System.out.println("program started -----------------------------------------");
    String str_url = "https://www.subf2m.co/subtitles/farsi_persian-text/SImp4fRrRnBK6j-u2RiPdXSsHSuGVCDLz4XZQLh05FnYmw92n7DZP6KqbHhwp6gfvrxazMManmskHql6va6XEfasUDxGevFRmkWJLjCzsCK50w1lwNajPoMGPTy9ebCC0&name=Q2FwdGFpbiBNYXJ2ZWwgRmFyc2lQZXJzaWFuIGhlYXJpbmcgaW1wYWlyZWQgc3VidGl0bGUgLSBTdWJmMm0gW3N1YmYybS5jb10uemlw";
    URL url = new URL(str_url);
    HttpsURLConnection con = (HttpsURLConnection) url.openConnection();
    con.setConnectTimeout(150000);
    con.setReadTimeout(150000);
    con.setRequestMethod("HEAD");
    con.setInstanceFollowRedirects(false);
    con.setRequestProperty("Accept-Encoding", "identity");
    con.setRequestProperty("connection", "close");
    con.connect();
    System.out.println("responseCode: " + con.getResponseCode());
    System.out.println("contentLength: " + con.getContentLength());
} catch (IOException e) {
    System.out.println("error | " + e.toString());
    e.printStackTrace();
}
output:
program started -----------------------------------------
responseCode: 403
contentLength: -1
The default Java user-agent is blocked by some online services (most notably, Cloudflare). You need to set the User-Agent header to something else.
con.setRequestProperty("User-Agent", "My-User-Agent");
In my experience, it doesn't matter what you set it to, as long as it's not the default one:
con.setRequestProperty("User-Agent", "aaa"); // works perfectly fine
EDIT: It looks like this site uses Cloudflare with DDoS protection active. Your code can't run the JavaScript challenge that is required to actually get the content of the file, so changing the User-Agent alone may not be enough for this particular URL.
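For reference, here is a minimal sketch of the whole HEAD request with an explicit User-Agent. The class name `HeadRequest`, the helper `openHead`, the shorter timeouts and the User-Agent string are my own placeholders, not anything from the question. It only shows where the header goes, and a site behind a Cloudflare JavaScript challenge may still answer 403.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper class, for illustration only.
final class HeadRequest {

    // Builds (but does not yet send) a HEAD request with an explicit User-Agent.
    static HttpURLConnection openHead(String address, String userAgent) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(address).openConnection();
        con.setRequestMethod("HEAD");
        // Replace the default "Java/<version>" User-Agent, which some services block.
        con.setRequestProperty("User-Agent", userAgent);
        // Ask for the uncompressed size so Content-Length matches the file size.
        con.setRequestProperty("Accept-Encoding", "identity");
        con.setConnectTimeout(15000);
        con.setReadTimeout(15000);
        return con;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java HeadRequest <url>");
            return;
        }
        // The User-Agent value here is an arbitrary placeholder.
        HttpURLConnection con = openHead(args[0], "Mozilla/5.0 (compatible; example)");
        int code = con.getResponseCode(); // this call actually sends the request
        System.out.println("responseCode: " + code);
        if (code == HttpURLConnection.HTTP_OK) {
            System.out.println("contentLength: " + con.getContentLengthLong());
        } else {
            System.out.println("no usable Content-Length; the server refused the request");
        }
        con.disconnect();
    }
}
```

Checking the response code before reading the length avoids mistaking the -1 from an error response for a real file size.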