I want to read a JS file as a string from the URL https://d3c3cq33003psk.cloudfront.net/opentag-67008-473432.js
I tried several approaches (reading directly from the URL, and downloading first and then reading), but every time I got unreadable characters, like �(��_�s��d������:`���.����i�....
Here is what I tried:
1. Download the file from the URL:
FileUtils.copyURLToFile(jsUrl, file);
2. Read from the URL:
StringBuilder sb = new StringBuilder();
try {
URL url = new URL(jsUrl);
// read text returned by server
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"));
String line;
while ((line = in.readLine()) != null) {
sb.append(line).append("\n");
}
in.close();
} catch (Exception e) {
e.printStackTrace(); // don't silently swallow I/O errors
}
return sb.toString();
If I download the file manually from the URL (page -> Save as...), it opens in Notepad++ as normal UTF-8 text.
Could anybody help me handle this weird file?
It's GZIPped. Use a GZIPInputStream.
UPDATE
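// cnt is the URLConnection obtained from url.openConnection();
// the stream and the header must come from the same connection
InputStream stream = cnt.getInputStream();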
if ("gzip".equalsIgnoreCase(cnt.getHeaderField("Content-Encoding"))) {
stream = new GZIPInputStream(stream);
}
BufferedReader in = new BufferedReader(new InputStreamReader(stream, "UTF-8"));
UPDATE 2
With URLConnection:
URLConnection cnt = url.openConnection();
InputStream stream = cnt.getInputStream();
if ("gzip".equalsIgnoreCase(cnt.getHeaderField("Content-Encoding"))) {
stream = new GZIPInputStream(stream);
}
BufferedReader read = new BufferedReader(new InputStreamReader(stream, "UTF-8"));
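For completeness, here is a minimal self-contained sketch that puts the snippets above together. The class name `GzipAwareReader` and the method names `readBody` and `readUrl` are made up for illustration, and it assumes the server reports the compression through the `Content-Encoding` header, as it appears to here:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class, not part of any library.
class GzipAwareReader {

    // Decodes a response body as UTF-8 text, unwrapping gzip first
    // when the Content-Encoding header value says "gzip".
    static String readBody(InputStream raw, String contentEncoding) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (InputStream body = raw;
             InputStream stream = "gzip".equalsIgnoreCase(contentEncoding)
                     ? new GZIPInputStream(body)
                     : body;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Opens the URL and reads it through readBody, taking the stream and
    // the header from the same URLConnection.
    static String readUrl(String jsUrl) throws IOException {
        URLConnection cnt = new URL(jsUrl).openConnection();
        return readBody(cnt.getInputStream(), cnt.getHeaderField("Content-Encoding"));
    }
}
```

You would call it as `String js = GzipAwareReader.readUrl(jsUrl);`. Splitting out `readBody` keeps the decompression logic separate from the network call, so it can be tried on in-memory bytes as well.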