Amazon S3 File Permissions, Access Denied when copied from another account
A very interesting conundrum! Fortunately, there is a solution.
First, a recap:
- Bucket A in Account A
- Bucket B in Account B
- User in Account A copies objects to Bucket B (having been granted appropriate permissions to do so)
- Objects in Bucket B still belong to Account A and cannot be accessed by Account B
I managed to reproduce this and can confirm that users in Account B cannot access the file -- not even the root user in Account B!
Fortunately, things can be fixed. The aws s3 cp command in the AWS Command-Line Interface (CLI) can update permissions on a file when it is copied to the same name. However, to trigger this, you also have to update something else (such as the metadata), otherwise you get this error:
This copy request is illegal because it is trying to copy an object to itself without changing the object's metadata, storage class, website redirect location or encryption attributes.
Therefore, the permissions can be updated with this command:
aws s3 cp s3://my-bucket/ s3://my-bucket/ --recursive --acl bucket-owner-full-control --metadata "One=Two"
- Must be run by an Account A user that has access permissions to the objects (e.g. the user who originally copied the objects to Bucket B)
- The metadata content is unimportant, but it is needed to force the update
- --acl bucket-owner-full-control grants permission to Account B, so you'll be able to use the objects as normal
End result: A bucket you can use!
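If you would rather run the same fix from code instead of the CLI, here is a minimal boto3 sketch of the in-place copy (the bucket name and metadata values are illustrative; it assumes the Account A user's credentials):

import boto3

s3 = boto3.client('s3')  # must run with the Account A user's credentials
bucket = 'my-bucket'

# Copy every object onto itself, replacing the metadata so the copy is legal,
# and granting the bucket owner (Account B) full control
paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get('Contents', []):
        s3.copy_object(
            Bucket=bucket,
            Key=obj['Key'],
            CopySource={'Bucket': bucket, 'Key': obj['Key']},
            MetadataDirective='REPLACE',
            Metadata={'One': 'Two'},
            ACL='bucket-owner-full-control'
        )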
To correctly set the appropriate permissions for newly added files, add this bucket policy:
[...]
{
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::123456789012:user/their-user"
  },
  "Action": [
    "s3:PutObject",
    "s3:PutObjectAcl"
  ],
  "Resource": "arn:aws:s3:::my-bucket/*"
}
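If you prefer to attach that policy from code rather than the console, a hedged sketch using boto3's put_bucket_policy would look like this (the account ID, user name, and bucket name are placeholders, and the statement is wrapped in the standard policy envelope):

import json
import boto3

s3 = boto3.client('s3')  # run as a user in Account B allowed to set bucket policies

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:user/their-user"},
            "Action": ["s3:PutObject", "s3:PutObjectAcl"],
            "Resource": "arn:aws:s3:::my-bucket/*"
        }
    ]
}

# Attach the policy to the bucket
s3.put_bucket_policy(Bucket='my-bucket', Policy=json.dumps(policy))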
And set the ACL for newly created files in code. Python example:
import boto3

client = boto3.client('s3')

local_file_path = '/home/me/data.csv'
bucket_name = 'my-bucket'
bucket_file_path = 'exports/data.csv'

# Upload the file and grant the bucket owner full control over the new object
client.upload_file(
    local_file_path,
    bucket_name,
    bucket_file_path,
    ExtraArgs={'ACL': 'bucket-owner-full-control'}
)
source: https://medium.com/artificial-industry/how-to-download-files-that-others-put-in-your-aws-s3-bucket-2269e20ed041 (disclaimer: written by me)
Alternatively, grant the bucket owner full control at copy time by specifying the ACL on the copy command itself:
aws s3 cp s3://account1/ s3://accountb/ --recursive --acl bucket-owner-full-control
In case anyone is trying to do the same thing using a Hadoop/Spark job instead of the AWS CLI:
- Step 1: Grant the user in Account A the appropriate permissions to copy objects to Bucket B (as mentioned in the answer above).
- Step 2: Set the fs.s3a.acl.default configuration option in the Hadoop Configuration. This can be set in the conf file or programmatically:
Conf File:
<property>
  <name>fs.s3a.acl.default</name>
  <description>Set a canned ACL for newly created and copied objects. Value may be Private, PublicRead, PublicReadWrite, AuthenticatedRead, LogDeliveryWrite, BucketOwnerRead, or BucketOwnerFullControl.</description>
  <value>$chooseOneFromDescription</value>
</property>
Programmatically:
spark.sparkContext.hadoopConfiguration.set("fs.s3a.acl.default", "BucketOwnerFullControl")
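If your job is written in PySpark rather than Scala, the same option can be forwarded to the Hadoop configuration via the spark.hadoop.* prefix when building the session (a sketch; the app name is arbitrary):

from pyspark.sql import SparkSession

# The spark.hadoop.* prefix passes the option through to the Hadoop configuration
spark = (
    SparkSession.builder
    .appName("cross-account-write")
    .config("spark.hadoop.fs.s3a.acl.default", "BucketOwnerFullControl")
    .getOrCreate()
)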