How to use AWS CLI to only copy files in S3 bucket that match a given string pattern
The alternatives that you have listed are the best options, because the AWS S3 CLI doesn't support regex.
Use of Exclude and Include Filters:
Currently, there is no support for the use of UNIX style wildcards in a command's path arguments. However, most commands have --exclude "<value>" and --include "<value>" parameters that can achieve the desired result. These parameters perform pattern matching to either exclude or include a particular file or object. The following pattern symbols are supported.
*: Matches everything
?: Matches any single character
[sequence]: Matches any character in sequence
[!sequence]: Matches any character not in sequence
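As a rough sketch of how these filters combine for the question asked: exclude everything first, then include the pattern you want, since filters that appear later on the command line take precedence. The function name copy_matching, the AWS_BIN variable (added only so the command can be stubbed out), the bucket name, and the pattern below are all placeholders, not anything from the AWS docs.

```shell
#!/bin/sh
# copy_matching SRC DEST PATTERN
# Hypothetical helper: copies only the objects under SRC whose key matches PATTERN.
# AWS_BIN is an assumption for illustration; it defaults to the real `aws` CLI.
copy_matching() {
  # --exclude "*" drops every object, then --include re-adds the matching ones.
  "${AWS_BIN:-aws}" s3 cp "$1" "$2" --recursive --exclude "*" --include "$3"
}

# Example call (placeholder bucket and pattern):
# copy_matching s3://my-bucket/logs/ ./logs/ "2016-11-04*"
```

Keep the patterns quoted so the local shell does not expand them before the CLI sees them. Running the underlying command with --dryrun first shows what would be copied without transferring anything.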
Putting this here for others to find, since I just had to figure this out. Here's what I came up with:
s3cmd del $(s3cmd ls s3://[BUCKET]/ | grep '.*s3://[BUCKET]/[FILENAME]' | cut -c 41-)
You can put the regex in the grep search string. For instance, I was searching for specific files to delete (hence the s3cmd del). My regex looked like: '2016-11-04.*s3.*[DN][RS].*'. You may have to adjust the cut for your use. Should also work with s3cmd get.
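The same list-then-grep idea should carry over to the AWS CLI if you don't have s3cmd. This is only a sketch, not a tested recipe: filter_keys is a made-up helper name, the bucket and destination are placeholders, and it assumes the usual "date time size key" layout of aws s3 ls output. It strips the leading columns with sed instead of a fixed cut offset, so keys containing spaces survive and PRE (prefix) lines are skipped.

```shell
#!/bin/sh
# filter_keys REGEX
# Hypothetical helper: reads `aws s3 ls` style lines on stdin and prints
# only the object keys that match the extended regex REGEX.
filter_keys() {
  # Drop the date, time and size columns; lines without them (e.g. PRE lines)
  # are not printed at all. Then apply the regex to the bare key.
  sed -n -E 's/^[0-9-]+ +[0-9:]+ +[0-9]+ //p' | grep -E -- "$1"
}

# Example (placeholder bucket, regex and destination):
# aws s3 ls "s3://my-bucket/" | filter_keys '2016-11-04.*s3.*[DN][RS]' |
#   while IFS= read -r key; do
#     aws s3 cp "s3://my-bucket/$key" ./downloads/
#   done
```

Swap aws s3 cp for aws s3 rm in the loop to delete instead of copy; printing the keys first and checking them before doing either is probably wise.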