ETag vs Header Expires
Etag and Last-modified headers are validators.
They help the browser and/or the cache (reverse proxy) to understand if a file/page, has changed, even if it preserves the same name.
Expires and Cache-control are giving refresh information.
This means that they inform, the browser and the reverse in-between proxies, up to what time or for how long, they may keep the page/file at their cache.
So the question usually is which one validator to use, etag or last-modified, and which refresh infomation header to use, expires or cache-control.
Expires
and Cache-Control
are "strong caching headers"
Last-Modified
and ETag
are "weak caching headers"
First the browser check Expires/Cache-Control
to determine whether or not to make a request to the server
If have to make a request, it will send Last-Modified/ETag
in the HTTP request. If the Etag
value of the document matches that, the server will send a 304 code instead of 200, and no content. The browser will load the contents from its cache.
They are slightly different - the ETag does not have any information that the client can use to determine whether or not to make a request for that file again in the future. If ETag is all it has, it will always have to make a request. However, when the server reads the ETag from the client request, the server can then determine whether to send the file (HTTP 200) or tell the client to just use their local copy (HTTP 304). An ETag is basically just a checksum for a file that semantically changes when the content of the file changes.
The Expires header is used by the client (and proxies/caches) to determine whether or not it even needs to make a request to the server at all. The closer you are to the Expires date, the more likely it is the client (or proxy) will make an HTTP request for that file from the server.
So really what you want to do is use BOTH headers - set the Expires header to a reasonable value based on how often the content changes. Then configure ETags to be sent so that when clients DO send a request to the server, it can more easily determine whether or not to send the file back.
One last note about ETag - if you are using a load-balanced server setup with multiple machines running Apache you will probably want to turn off ETag generation. This is because inodes are used as part of the ETag hash algorithm which will be different between the servers. You can configure Apache to not use inodes as part of the calculation but then you'd want to make sure the timestamps on the files are exactly the same, to ensure the same ETag gets generated for all servers.