Checksum in HTTP response header - why not?
Digest
is the standard header used to convey the checksum of a selected representation of a resource (that is, the payload body).
An example response with digest.
>200 OK
>...
>Digest: sha-256=X48E9qOokqqrvdts8nOJRJN3OWDUoyWxBf7kbu9DBPE=
>
>{"hello": "world"}
Digest
may be used both in request and responses.
It's a good practice to validate the data against the digest before processing it.
You can see the related page on mozilla website for an indepth discussion around the payload body in http.
I guess that whole HTTP-based Internet is working, because we're using TCP protocol
No, the integrity on the web is ensured by TLS. Non-TLS communication should not be trusted. See rfc8446
The checksum provided separately from the file is used for integrity check when doing Non TLS
or indirect
transfer.
Maybe I know your doubt because I had the same question about the checksums, let's find it out.
There are two tasks to be considered:
- File broken during transfer
- File be changed by hacker
And three protocol in this question:
- HTTP protocol
- SSL/TLS protocol
- TCP protocol
Now we separate into two situations:
1. File provider and client transfer the file directly, no proxy, no offline(usb disk).
The TCP protocol
promise: the data from server is exactly same as the data client received, by checksum and ack.
The TLS protocol
promise: the server is authenticated (is truly ubuntu.com) and the data is not changed by any middleman.
So there is no need to add checksum header in HTTP protocol when doing HTTPS.
But when TLS is not enabled, forgery could happen: bad guy in middle gives a bad file to the client.
2. File provider and client transfer the file indirectly, by CDN
, by mirror, by offline way(usb disk).
Many sites like ubuntu.com use 3-party CDN
to serve static files, which the CDN server is not managed by ubuntu.com.
http://releases.ubuntu.com/somefile.iso
redirect to http://59.80.44.45/somefile.iso
.
Now the checksum must be provided out-of-band because it is not authenticated we don't trust the connection. So checksum header in HTTP protocol is helpless in this situation.