Will the same JavaScript fetched by HTTP and HTTPS be cached separately by the browser?
Resources are cached by their URL, and the protocol (http:// or https://) is part of the URL. Since the protocol differs, the URL must also differ, and you have two separate cache entries.
It is perfectly fine if an http:// and an https:// resource provide different data, even if everything but the access method is the same. For example, a request to an http:// URL will today often result in a redirect response, while the request to the corresponding https:// URL provides the real content. A browser will therefore cache these resources independently of each other.
Summary:
- The primary cache key for any standards-compliant browser is an absolute URI
- The absolute URI begins http: for all insecure requests and https: for all secure requests
- Consequently, a resource fetched securely can never use the same cache key as a resource fetched insecurely
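As a rough illustration of the summary above, here is a toy cache keyed by the absolute URI string. It is only a sketch: the `store` and `lookup` helpers, the `app.js` URLs, and the response strings are made up for the example, and a real browser cache is far more involved.

```python
from typing import Dict, Optional

# Toy cache: the key is the absolute URI string, nothing else.
cache: Dict[str, str] = {}

def store(uri: str, response: str) -> None:
    """Remember a response under its absolute URI."""
    cache[uri] = response

def lookup(uri: str) -> Optional[str]:
    """Return the stored response for this exact URI, if any."""
    return cache.get(uri)

# Hypothetical responses: the insecure URL redirects, the secure one serves content.
store("http://example.com/app.js", "301 Moved Permanently")
store("https://example.com/app.js", "200 OK")

# The two URIs differ in their scheme, so they occupy two separate entries.
http_entry = lookup("http://example.com/app.js")
https_entry = lookup("https://example.com/app.js")
```

Because the scheme is part of the key, storing the HTTPS response never overwrites the HTTP one, and vice versa.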
The current standard for HTTP is split across multiple "RFC" documents, with RFC 7234 dedicated entirely to caching, because there is a lot of complexity involved.
In section 2, "Overview of Cache Operation", there is this summary:
The primary cache key consists of the request method and target URI. However, since HTTP caches in common use today are typically limited to caching responses to GET, many caches simply decline other methods and use only the URI as the primary cache key.
This is more formally stated in the first bullet point in section 4, which says:
When presented with a request, a cache MUST NOT reuse a stored response, unless [...] the presented effective request URI (Section 5.5 of RFC7230) and that of the stored response match [...]
Section 5.5 of RFC 7230 starts by saying:
For a user agent, the effective request URI is the target URI.
A browser is a "user agent", so this is the case we're concerned with here. "Target URI" is defined in section 5.1:
A URI reference (Section 2.7) is typically used as an identifier for the "target resource", which a user agent would resolve to its absolute form in order to obtain the "target URI". The target URI excludes the reference's fragment component, if any, since fragment identifiers are reserved for client-side processing (RFC3986, Section 3.5).
The generic definition of a URI is in RFC 3986, and HTTP-specific concerns take up three pages of RFC 7230. The most relevant part for our purposes is RFC 3986 section 4.3, which defines this grammar for Absolute URIs:
absolute-URI = scheme ":" hier-part [ "?" query ]
Crucially, note that scheme is a mandatory part of any Absolute URI. Since HTTP URIs always use the scheme http and HTTPS URIs always use the scheme https, this means that their absolute URIs, and thus their "primary cache keys" in a browser, can never collide.
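To make the chain of definitions concrete, here is a minimal sketch in Python that derives a primary cache key the way the quoted sections describe: resolve the reference to absolute form, drop the fragment, and pair the result with the request method. The function names `target_uri` and `primary_cache_key` and the example URLs are my own; only the `urllib.parse` calls come from the standard library.

```python
from urllib.parse import urldefrag, urljoin, urlsplit

def target_uri(base: str, reference: str) -> str:
    """Resolve a URI reference to absolute form and drop any fragment."""
    absolute = urljoin(base, reference)
    uri, _fragment = urldefrag(absolute)
    return uri

def primary_cache_key(method: str, base: str, reference: str) -> tuple:
    """Request method plus target URI, as in the RFC 7234 overview."""
    return (method.upper(), target_uri(base, reference))

# The same script reference, seen from an insecure and a secure page.
key_http = primary_cache_key("GET", "http://example.com/page", "/static/app.js#v2")
key_https = primary_cache_key("GET", "https://example.com/page", "/static/app.js#v2")

# The scheme survives into the key, so the two keys cannot be equal.
scheme_http = urlsplit(key_http[1]).scheme
scheme_https = urlsplit(key_https[1]).scheme
```

The fragment `#v2` is stripped from both keys, but the leading `http:` or `https:` is not, which is exactly why the keys differ.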
Other answers have mentioned ports. RFC 7230, Section 2.7.1 defines http URIs as including an "authority" section, which is defined in RFC 3986, Section 3.2:
authority = [ userinfo "@" ] host [ ":" port ]
The port is optional, with RFC 7230, Section 2.7.1 defining the default for the http URI scheme:
If the port subcomponent is empty or not given, TCP port 80 (the reserved port for WWW services) is the default.
The following section then defines the default for the https scheme:
All of the requirements listed above for the "http" scheme are also requirements for the "https" scheme, except that TCP port 443 is the default if the port subcomponent is empty or not given, and ...
It then follows that:
- Any HTTP request not on port 80 must include a port number in its absolute URI
- Any HTTPS request not on port 443 must include a port number in its absolute URI
- No two requests with different port numbers specified will have the same cache key, since they will have distinct absolute URIs
Thus these URIs would all be cached separately:
- http://example.com/some/resource (default port 80)
- https://example.com/some/resource (default port 443)
- http://example.com:8000/some/resource (non-default port)
- https://example.com:8000/some/resource (non-default port)
- http://example.com:443/some/resource (insecure request on port normally used for HTTPS)
- https://example.com:80/some/resource (secure request on port normally used for plain HTTP)
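A quick way to check the list above is to compute, for each URI, the scheme, host, and effective port (the explicit port if present, otherwise the scheme's default). This is a sketch under my own naming: `DEFAULT_PORTS` and `scheme_host_port` are illustrative helpers, not anything defined by the RFCs.

```python
from urllib.parse import urlsplit

# Defaults taken from RFC 7230, Sections 2.7.1 and 2.7.2.
DEFAULT_PORTS = {"http": 80, "https": 443}

def scheme_host_port(uri: str) -> tuple:
    """Return (scheme, host, effective port) for an http or https URI."""
    parts = urlsplit(uri)
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return (parts.scheme, parts.hostname, port)

URIS = [
    "http://example.com/some/resource",
    "https://example.com/some/resource",
    "http://example.com:8000/some/resource",
    "https://example.com:8000/some/resource",
    "http://example.com:443/some/resource",
    "https://example.com:80/some/resource",
]

triples = [scheme_host_port(u) for u in URIS]

# Six URIs, six distinct strings, and six distinct (scheme, host, port) triples.
distinct_uris = len(set(URIS))
distinct_triples = len(set(triples))
```

Note that `http://example.com:443/...` and `https://example.com/...` share port 443 but still differ in scheme, so they remain distinct even when compared by effective port rather than by raw string.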
The only thing I'm not clear on is whether the browser should, may, or must normalise URIs which explicitly mention the port which would be the default anyway. In other words, whether these two URIs would be cached separately or not:
- http://example.com/some/resource
- http://example.com:80/some/resource
I can't think of any practical consequence of normalising these to the same cache key, because by the definitions above they are guaranteed to represent the same resource.
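If a cache did choose to normalise, one possible approach is to drop the port only when it equals the default for that URI's own scheme. The sketch below assumes exactly that rule; `normalise_default_port` is a hypothetical helper, and I am not claiming any particular browser implements it this way.

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def normalise_default_port(uri: str) -> str:
    """Drop an explicit port if it is the default for the URI's scheme."""
    parts = urlsplit(uri)
    if parts.port is not None and parts.port == DEFAULT_PORTS.get(parts.scheme):
        # A port is present, so the last ":" in the authority precedes it.
        netloc = parts.netloc.rsplit(":", 1)[0]
        return urlunsplit(parts._replace(netloc=netloc))
    return uri

# Explicit default port collapses to the short form.
a = normalise_default_port("http://example.com/some/resource")
b = normalise_default_port("http://example.com:80/some/resource")

# A port that is the default for the *other* scheme is left alone.
c = normalise_default_port("http://example.com:443/some/resource")
```

Under this rule the two URIs in question share one key, while none of the six URIs listed earlier are merged with each other.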