AWS: how to fix S3 event replacing space with '+' sign in object key names in json
I came across this looking for a solution for a lambda written in python instead of java; "urllib.parse.unquote_plus" worked for me, it properly handled a file with both spaces and + signs:
from urllib.parse import unquote_plus
import boto3
bucket = 'testBucket1234'
# uploaded file with name 'foo + bar.txt' for test, s3 Put event passes following encoded object_key
object_key = 'foo %2B bar.txt'
print(object_key)
object_key = unquote_plus(object_key)
print(object_key)
client = boto3.client('s3')
client.get_object(Bucket=bucket, Key=object_key)
What I have done to fix this is
java.net.URLDecoder.decode(b.getS3().getObject().getKey(), "UTF-8")
{
"Records": [
{
"s3": {
"object": {
"key": "New+Text+Document.txt"
}
}
}
]
}
So now the JSon value, "New+Text+Document.txt" gets converted to New Text Document.txt, correctly.
This has fixed my issue, please suggest if this is very correct solution. Will there be any corner case that can break my implementation.