Exporting data from Google Cloud Storage to Amazon S3
You can use gsutil to copy data from a Google Cloud Storage bucket to an Amazon S3 bucket, using a command such as:
gsutil -m rsync -rd gs://your-gcs-bucket s3://your-s3-bucket
Note that the -d option above will cause gsutil rsync to delete objects from your S3 bucket that aren't present in your GCS bucket (in addition to adding new objects). You can leave off that option if you just want to add new objects from your GCS to your S3 bucket.
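For example, leaving off -d gives a non-destructive sync that only adds or updates objects in the S3 bucket (bucket names are placeholders):
gsutil -m rsync -r gs://your-gcs-bucket s3://your-s3-bucket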
Go to any instance or Cloud Shell in GCP.
First, configure your AWS credentials on that machine:
aws configure
If the command is not recognised, install the AWS CLI by following this guide: https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html
For details on aws configure, see https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html
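A minimal sketch of that setup, assuming a Debian/Ubuntu-based instance (adjust the install step for your OS, and replace the placeholder values):

# Install the AWS CLI if it is not already available (Debian/Ubuntu; see the install guide above for other systems)
sudo apt-get update && sudo apt-get install -y awscli

# Store the AWS credentials; aws configure prompts for these values
aws configure
# AWS Access Key ID [None]: YOUR_AWS_ACCESS_KEY_ID
# AWS Secret Access Key [None]: YOUR_AWS_SECRET_ACCESS_KEY
# Default region name [None]: us-east-1
# Default output format [None]: json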
Then run gsutil:
gsutil -m rsync -rd gs://storagename s3://bucketname
16 GB of data transferred in a few minutes.
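If gsutil does not pick up the keys written by aws configure, you can also add them to the boto configuration file that gsutil reads; a minimal sketch, assuming the default ~/.boto location (keys are placeholders):

# Append S3 credentials to the [Credentials] section of the boto config used by gsutil
cat >> ~/.boto <<'EOF'
[Credentials]
aws_access_key_id = YOUR_AWS_ACCESS_KEY_ID
aws_secret_access_key = YOUR_AWS_SECRET_ACCESS_KEY
EOF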
Using Rclone (https://rclone.org/).
Rclone is a command-line program to sync files and directories to and from the following backends (a short command sketch follows the list):
- Google Drive
- Amazon S3
- Openstack Swift / Rackspace cloud files / Memset Memstore
- Dropbox
- Google Cloud Storage
- Amazon Drive
- Microsoft OneDrive
- Hubic
- Backblaze B2
- Yandex Disk
- SFTP
- The local filesystem
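A minimal sketch of a GCS-to-S3 sync with Rclone, assuming you have created two remotes named gcs and s3 via rclone config (remote and bucket names are placeholders):

# Set up the remotes interactively (one for Google Cloud Storage, one for Amazon S3)
rclone config

# Sync the GCS bucket to the S3 bucket; --progress shows transfer status
rclone sync --progress gcs:your-gcs-bucket s3:your-s3-bucket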
Using the gsutil tool, we can do a wide range of bucket and object management tasks, including the following (a short sketch of these commands follows the list):
- Creating and deleting buckets.
- Uploading, downloading, and deleting objects.
- Listing buckets and objects.
- Moving, copying, and renaming objects.
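For example, a few of these tasks look like this (bucket and object names are placeholders):

gsutil mb gs://your-bucket                                      # create a bucket
gsutil cp local-file.txt gs://your-bucket                       # upload an object
gsutil ls gs://your-bucket                                      # list objects
gsutil mv gs://your-bucket/old.txt gs://your-bucket/new.txt     # rename/move an object
gsutil rm gs://your-bucket/new.txt                              # delete an object
gsutil rb gs://your-bucket                                      # delete the (now empty) bucket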
We can copy data from a Google Cloud Storage bucket to an Amazon S3 bucket using the gsutil rsync and gsutil cp operations.
gsutil rsync collects all the metadata from the bucket and syncs the data to S3:
gsutil -m rsync -r gs://your-gcs-bucket s3://your-s3-bucket
gsutil cp copies the files one by one; the transfer rate is good, roughly 1 GB per minute:
gsutil cp gs://<gcs-bucket> s3://<s3-bucket-name>
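To copy a whole bucket or prefix in one go, a recursive, parallel variant looks like this (bucket names are placeholders):

gsutil -m cp -r gs://<gcs-bucket>/* s3://<s3-bucket-name>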
If you have a large number of files with a high data volume, use the Bash script below and run it in the background with multiple parallel sessions using the screen command (see the sketch at the end of this answer), on an AWS or GCP instance with AWS credentials configured and GCP auth verified.
Before running the script, list all the files, redirect the output to a file, and have the script read that file as its input list:
gsutil ls gs://<gcs-bucket> > file_list_part.out
Bash script:
#!/bin/bash
# Copy every object listed in the input file from GCS to S3, one at a time.
echo "start processing"

# Object list produced by `gsutil ls`; a different list file can be passed
# as the first argument (useful when the list is split for parallel runs).
input="${1:-file_list_part.out}"

while IFS= read -r line; do
    echo "copying :: ${line} :: $(date)"
    if ! gsutil cp "${line}" "s3://<bucket-name>"; then
        echo "Error copying ${line}"
        exit 1
    fi
    echo "Copy completed successfully"
done < "$input"

echo "completed processing"
Execute the Bash script and write the output to a log file so you can check which files completed and which failed:
bash file_copy.sh > /root/logs/file_copy.log 2>&1
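If the object list is very large, one option is to split it and run one screen session per chunk; a rough sketch, assuming GNU split and the script above reading its list file from the first argument:

# Split the object list into 4 roughly equal chunks (part_aa, part_ab, ...)
split -n l/4 file_list_part.out part_

# Launch one detached screen session per chunk, each writing its own log
for part in part_*; do
    screen -dmS "copy_${part}" bash -c "bash file_copy.sh ${part} > /root/logs/${part}.log 2>&1"
done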