What is the purpose of the expiration time in signed S3 urls?
It is useful in the following scenario:
You want to serve, say, video files, but only to certain users. You want to make it harder for them to share URLs.
Bob signs in to your site and selects a video. You generate the URL and sign it with an HMAC that covers the expiry time, set to, say, 5 seconds from now. The page loads in 1 second and the embedded video player makes the request to AWS. The URL is still valid.
Bob looks at the HTML source, finds the video URL, copies it and sends it to his friend. The friend opens the URL but it has expired.
The expiration time is there to limit the lifetime of the authorization to perform the allowed action by anyone in possession of the signed URL.
The expiration time in a URL cannot be changed without invalidating the signature, of course, since it's one of the inputs to HMAC. But the expiration time itself does not contribute anything to the actual security of the signing algorithm, because:
even though it does add information to the message being hashed, it does not add any secret information
the fact that a URL has "expired" does not prevent a malicious actor from brute-forcing it, if we assumed brute-forcing were computationally feasible... because you do not need S3 to confirm that you have brute-forced my signing secret. When your signing algorithm produces the same output I did based on the same input, you win. Sure, that specific URL you had is now expired, but now you have my signing secret and can sign any URL you like, with any expiration time you like.
Please note that nothing I am saying here should be taken to imply that I believe the security of the signing algorithm is flawed -- my only point is that the expiration time has no impact on whatever level of security the algorithm would provide if the expiration time were not a component.
Also for clarity, since the expiration time is part of the signed message, the expiration time is tamper-resistant, by definition. A valid URL of mine that expired yesterday can't be tweaked to "expire tomorrow" and have its signature still be valid.
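To make the tamper-resistance point concrete, here is a minimal sketch using a generic HMAC rather than the real S3 algorithm. The secret, the message layout and the function names are all made up for illustration.

```python
import hashlib
import hmac

# Assumption: a stand-in secret, not a real AWS credential.
SECRET = b"hypothetical-signing-secret"

def sign(path: str, expires: int) -> str:
    """Return a hex HMAC-SHA256 over the path and the expiration timestamp."""
    message = f"{path}\n{expires}".encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()

def is_authentic(path: str, expires: int, signature: str) -> bool:
    """Recompute the signature from the visible inputs and compare in constant time."""
    return hmac.compare_digest(sign(path, expires), signature)

signature = sign("/videos/cat.mp4", 1700000000)

# The untouched URL verifies.
print(is_authentic("/videos/cat.mp4", 1700000000, signature))  # True

# Changing only the expiration time breaks the signature.
print(is_authentic("/videos/cat.mp4", 1800000000, signature))  # False
```

Note that the expiration value is passed around in the clear the whole time. It is protected against modification, but it is not secret.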
The Signature Version 2 algorithm you've mentioned is very straightforward. Conceptually, you canonicalize the request you're going to make, and then you run it through HMAC. Nothing involved in the request is secret, except for the secret key. The request itself (the input message to the HMAC function), including the expiration time, has no hidden components. It couldn't, because then S3 wouldn't be able to generate the same signature, having had no secret communication from me about the upcoming request. That is how the signing algorithm works: for any given request, the message consists of the HTTP verb, the Content-MD5 header value if it will be sent, the Content-Type header value if it will be sent, the expiration time, the canonicalized X-Amz headers if they will be sent, and finally /${bucket}/${key}${canonical_query_string}, all these elements concatenated together with \n in between.
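As a rough sketch of that message construction, the following builds the newline-joined message and signs it. It assumes HMAC-SHA1 with a Base64-encoded, URL-escaped result, which is my understanding of how V2 query-string signatures are encoded. The function name, parameter names and the example key are hypothetical, and this is not a substitute for an AWS SDK.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote

def sign_v2_query(secret_key, verb, bucket, key, expires,
                  content_md5="", content_type="",
                  amz_header_lines=(), canonical_query_string=""):
    """Sketch of a V2-style signature over a canonicalized request.

    amz_header_lines is assumed to be already canonicalized,
    e.g. ["x-amz-acl:public-read"].
    """
    resource = f"/{bucket}/{key}{canonical_query_string}"
    parts = [verb, content_md5, content_type, str(expires)]
    parts.extend(amz_header_lines)
    parts.append(resource)
    string_to_sign = "\n".join(parts)  # every element is visible to the URL holder

    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    # Base64, then percent-encode so it can go in a query string.
    return quote(base64.b64encode(digest).decode("ascii"), safe="")

# Hypothetical inputs; the secret below is a placeholder, not a real key.
sig = sign_v2_query("example-secret", "GET", "my-bucket", "video.mp4", 1700000000)
print(f"/my-bucket/video.mp4?Expires=1700000000&Signature={sig}")
```

The only input that does not also appear in the resulting URL or request is the secret key.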
S3 knows my secret (as, indeed, apparently all AWS services do, so they can validate my requests), and attempts to generate the same signature as I did. If it succeeds, the request is allowed if the user whose credentials signed the request is actually allowed to make the request.
If you are in possession of a signed URL I generated, you can go at it brute force by reproducing the message I would have signed (you already know everything you need, from the URL, in order to do that), and at the point when you calculate the same Signature I gave you in my signed URL, you've cracked my secret key. For any given request and expiration time, there can only be exactly one correct signature.
Good luck with that cracking project, of course, because I will have rotated my key (and its accompanying secret) out of active status and disabled it after a few weeks or months, long before you have a meaningful chance to crack it.
But the point is that, just like every other component that goes into the message I'm HMAC-ing, you see &Expires= in the URL, and you know what value goes there, in order to reproduce the original message I signed. Expiration does nothing to complicate your task.
No, the expiration is strictly to control the valid lifetime of a signed URL I hand out, on the theory that unauthorized access to that URL takes a period of time to occur. The more sensitive the information, generally speaking, the shorter you should set the request expiration time to.
Side notes: Inclusion of the expiration time also gives the back-end servers a minor optimization, which you can prove for yourself -- expiration time is checked first, before the signature's validity is checked. There is, after all, no point in spending CPU cycles trying to verify a signature that's accompanied by an expiration time that clearly indicates it has expired. S3 appears to short-circuit this unnecessary work by returning the same "Request has expired" error when the expiration time has passed, without regard to whether or not the signature is valid. You can confirm this by manually changing the expiration time of a valid request. If you set it to the past, that invalidates the signature, but the error you get is "Request has expired." If you set it to the future, that also invalidates the signature, but the error you get indicates that the signature is invalid.
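The observed behavior is consistent with a check order like the one sketched below. This is purely an illustration of the idea and not S3's actual code. The secret, the function names and the error strings are placeholders.

```python
import hashlib
import hmac

# Assumption: a stand-in secret, not a real AWS credential.
SECRET = b"hypothetical-signing-secret"

def sign(resource: str, expires: int) -> str:
    """Sign a simplified V2-style message for a GET request."""
    message = f"GET\n\n\n{expires}\n{resource}".encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha1).hexdigest()

def validate(resource: str, expires: int, signature: str, now: int) -> str:
    """Check the cheap condition (expiry) before the expensive one (HMAC)."""
    if expires < now:
        # No HMAC is computed on this path.
        return "Request has expired"
    if not hmac.compare_digest(sign(resource, expires), signature):
        return "Signature does not match"
    return "OK"

now = 1000
good = sign("/bucket/key", 2000)

print(validate("/bucket/key", 2000, good, now))  # OK
print(validate("/bucket/key", 500, good, now))   # Request has expired
print(validate("/bucket/key", 3000, good, now))  # Signature does not match
```

The second and third calls both carry a signature that no longer matches the edited expiration time, yet only the future-dated one gets as far as the signature check.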
Also: Signature Version 4 is a much more complex -- and even more secure -- algorithm than V2, using 5 nested iterations of HMAC-SHA256... and there aren't 5 iterations because of some misguided notion that "more is better."
In truth, it took a while for the implications of this algorithm's design to sink in with me, and it seems like a rather brilliant one.
If you peel back the layers, it becomes apparent that AWS has designed this algorithm to internally (that is, within AWS) delegate trust according to the principle of least privilege.
The innermost HMAC iteration is a message consisting of today's date, signed with a secret composed of the string "AWS4" + my secret. That's the DateKey. It's only good for 7 days. The central security repository of IAM is the only entity that knows my secret, and my DateKey.
The next iteration signs the literal name of the AWS region (e.g. us-west-2) using the DateKey, to derive the DateRegionKey. IAM can then deliver this value to its subsystems within each region, such that they know all they need to know in order to validate my signatures for their region, but not globally.
Next, the AWS service name (e.g. s3) is signed with the DateRegionKey, to generate the DateRegionServiceKey. Within each region, the IAM subsystems can generate this secret for each service, delivering to each service all that the service needs to know in order to authenticate my signatures for their service, within that one region -- and (again) nothing more.
Then, the string "aws4_request" is signed with the DateRegionServiceKey, to form my daily signing key. Since "aws4_request" is essentially just a string, each service can derive (rather than store) the signing key I'll be using to sign requests today, thus having only the information that it needs, and nothing more.
Finally, my daily signing key is used to sign each canonicalized request.
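The chain of derivations described above can be sketched as follows. The function and variable names are mine, the date is assumed to be formatted as YYYYMMDD, and the string being signed at the end is a placeholder rather than a properly canonicalized V4 request.

```python
import hashlib
import hmac

def _hmac_sha256(key: bytes, message: str) -> bytes:
    """One HMAC-SHA256 step: sign `message` with `key`, return raw bytes."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()

def derive_signing_key(secret_key: str, date_stamp: str,
                       region: str, service: str) -> bytes:
    """Derive the daily signing key through four nested HMAC steps."""
    date_key = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    date_region_key = _hmac_sha256(date_key, region)
    date_region_service_key = _hmac_sha256(date_region_key, service)
    return _hmac_sha256(date_region_service_key, "aws4_request")

def sign_request(signing_key: bytes, string_to_sign: str) -> str:
    """Fifth HMAC step: sign the request with the derived key."""
    return _hmac_sha256(signing_key, string_to_sign).hex()

# Hypothetical inputs; the secret below is a placeholder, not a real key.
signing_key = derive_signing_key("example-secret", "20150830", "us-west-2", "s3")
signature = sign_request(signing_key, "placeholder canonicalized request")
print(signature)
```

Each intermediate value depends only on the one before it, so whoever holds the DateRegionKey can compute everything downstream of it for that region but cannot work backwards to the DateKey or to the original secret.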
Do you see what they did, there?
No system except the IAM core system needs to know my actual secret. If there were an internal breach in, say, an S3 region's infrastructure (however unlikely that is) the information an attacker could obtain would not reveal my actual secret to them -- only the secret that S3 in that region was aware of. If the regional infrastructure were breached, credentials obtained would be useless in other regions, etc., on up the chain to the root.
The V4 algorithm is at the core of what appears to be a secure, globally-distributed, secret signing key delegation system, where everything, internally, is on a need-to-know basis. Like I say... it seems rather brilliant.