I'd like to know some kind of file checksum (like SHA-256 hash, or anything else) when I start downloading file from HTTP server. It could be transferred as one of HTTP response headers.
HTTP etag
is something similar, but it's used only for invalidating browser cache and, from what I've noticed, every site is calculating it in different way and it doesn't look like any hash I know.
Some software download sites provide various file checksums as separate files to download (for example, latest Ubuntu 16.04 SHA1 hashes: http://releases.ubuntu.com/16.04/SHA1SUMS). Won't it be easier to just include them in HTTP response header and force browser to calculate it when download ends (and do not force user to do it manually).
I guess that whole HTTP-based Internet is working, because we're using TCP protocol, which is reliable and ensures received bytes are exactly same as one send by the server. But if TCP is so "cool", why do we check file hashes manually (see abouve Ubuntu example)? And lot of thing can go wrong during file download (client/server disk corruption, file modification on server side etc.). And if I'm right, everything could be fixed simply by passing file hash at download start.
Non TLS
or indirect
transfer.Maybe I know your doubt because I had the same question about the checksums, let's find it out.
There are two tasks to be considered:
And three protocol in this question:
Now we separate into two situations:
The TCP protocol
promise: the data from server is exactly same as the data client received, by checksum and ack.
The TLS protocol
promise: the server is authenticated (is truly ubuntu.com) and the data is not changed by any middleman.
So there is no need to add checksum header in HTTP protocol when doing HTTPS.
But when TLS is not enabled, forgery could happen: bad guy in middle gives a bad file to the client.
CDN
, by mirror, by offline way(usb disk).Many sites like ubuntu.com use 3-party CDN
to serve static files, which the CDN server is not managed by ubuntu.com.
http://releases.ubuntu.com/somefile.iso
redirect to http://59.80.44.45/somefile.iso
.
Now the checksum must be provided out-of-band because it is not authenticated we don't trust the connection. So checksum header in HTTP protocol is helpless in this situation.