Storage Transfer Service can listen to event notifications in AWS or Google Cloud to automatically transfer data that has been added or updated in the source location. Event-driven transfers are supported from AWS S3 or Cloud Storage to Cloud Storage.
For AWS S3 sources, event-driven transfers listen for Amazon S3 Event Notifications sent to an Amazon SQS queue. For Cloud Storage sources, notifications are sent to a Pub/Sub subscription.
Benefits of event-driven transfers
Because event-driven transfers listen for changes to the source bucket, updates are copied to the destination in near-real time. Storage Transfer Service doesn't need to execute a list operation against the source, saving time and money.
Use cases include:

- Event-driven analytics: Replicate data from AWS to Cloud Storage to perform analytics and processing.
- Cloud Storage replication: Enable automatic, asynchronous object replication between Cloud Storage buckets.

  Event-driven transfers with Storage Transfer Service differ from typical Cloud Storage replication in that they create a copy of your data in a different bucket. This provides benefits such as:

  - Keeping development and production data in separate namespaces.
  - Sharing data without providing access to the original bucket.
  - Backing up to a different continent, or to an area not covered by dual-region and multi-region storage.

- DR/HA setup: Replicate objects from a source to a backup destination within minutes:

  - Cross-cloud backup: Create a copy of an AWS S3 backup on Cloud Storage.
  - Cross-region or cross-project backup: Create a copy of a Cloud Storage bucket in a different region or project.

- Live migration: An event-driven transfer can power low-downtime migration, on the order of minutes of downtime, as a follow-up step to a one-time batch migration.
Set up event-driven transfers from Cloud Storage
Event-driven transfers from Cloud Storage use Pub/Sub notifications to know when objects in the source bucket have been modified or added. Object deletions are not detected; deleting an object at the source does not delete the associated object in the destination bucket.
Configure permissions
Find the name of the Storage Transfer Service service agent for your project:

Go to the googleServiceAccounts.get reference page. An interactive panel opens, titled Try this method.

In the panel, under Request parameters, enter your project ID. The project you specify here must be the project you're using to manage Storage Transfer Service, which might be different from the source bucket's project.

Click Execute. Your service agent's email is returned as the value of accountEmail. Copy this value.

The service agent's email uses the format project-PROJECT_NUMBER@storage-transfer-service.iam.gserviceaccount.com.

Grant the Pub/Sub Subscriber role to the Storage Transfer Service service agent.

Cloud console

Follow the instructions in Controlling access through the Google Cloud console to grant the Pub/Sub Subscriber role to the Storage Transfer Service service agent. The role can be granted at the topic, subscription, or project level.

gcloud CLI

Follow the instructions in Setting a policy to add the following binding:

{
  "role": "roles/pubsub.subscriber",
  "members": [
    "serviceAccount:project-PROJECT_NUMBER@storage-transfer-service.iam.gserviceaccount.com"
  ]
}
Configure Pub/Sub
Make sure that you've satisfied the Prerequisites for using Pub/Sub with Cloud Storage.
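If the Pub/Sub topic doesn't exist yet, create it first; TOPIC_NAME here is a placeholder:

gcloud pubsub topics create TOPIC_NAME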
Configure Pub/Sub notification for Cloud Storage:
gcloud storage buckets notifications create gs://BUCKET_NAME --topic=TOPIC_NAME
Create a pull subscription for the topic:
gcloud pubsub subscriptions create SUBSCRIPTION_ID --topic=TOPIC_NAME --ack-deadline=300
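To verify that the notification configuration was created, you can list the bucket's notification configurations:

gcloud storage buckets notifications list gs://BUCKET_NAME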
Create a transfer job
You can use the REST API or the Google Cloud console to create an event-driven transfer job.
Don't include sensitive information such as personally identifiable information (PII) or security data in your transfer job name. Resource names may be propagated to the names of other Google Cloud resources and may be exposed to Google-internal systems outside of your project.
Cloud console
Go to the Create transfer job page in the Google Cloud console.
Select Cloud Storage as both the source and the destination.
As the Scheduling mode, select Event-driven and click Next step.
Select the source bucket for this transfer.
In the Event stream section, enter the subscription name:
projects/PROJECT_NAME/subscriptions/SUBSCRIPTION_ID
Optionally, define any filters, then click Next step.
Select the destination bucket for this transfer.
Optionally, enter a start and end time for the transfer. If you don't specify a time, the transfer will start immediately and will run until manually stopped.
Specify any transfer options. More information is available from the Create transfers page.
Click Create.
Once created, the transfer job starts running and an event listener waits for notifications on the Pub/Sub subscription. The job details page shows one operation each hour, and includes details on data transferred for each job.
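To check on the job from the command line, the gcloud CLI's transfer command group can list the job's operations. A minimal sketch, where JOB_NAME is the name returned when the job was created:

gcloud transfer operations list --job-names=JOB_NAME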
REST
To create an event-driven transfer using the REST API, send the following JSON object to the transferJobs.create endpoint:
{
  "description": "YOUR DESCRIPTION",
  "status": "ENABLED",
  "projectId": "PROJECT_ID",
  "transferSpec": {
    "gcsDataSource": {
      "bucketName": "GCS_SOURCE_NAME"
    },
    "gcsDataSink": {
      "bucketName": "GCS_SINK_NAME"
    }
  },
  "eventStream": {
    "name": "projects/PROJECT_NAME/subscriptions/SUBSCRIPTION_ID",
    "eventStreamStartTime": "2022-12-02T01:00:00+00:00",
    "eventStreamExpirationTime": "2023-01-31T01:00:00+00:00"
  }
}
The eventStreamStartTime and eventStreamExpirationTime fields are optional. If the start time is omitted, the transfer starts immediately; if the end time is omitted, the transfer continues until manually stopped.
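For example, one way to send this request with curl, assuming the JSON object above is saved as transfer-job.json and you're authenticated with the gcloud CLI:

curl -X POST https://storagetransfer.googleapis.com/v1/transferJobs \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -d @transfer-job.json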
Client libraries
Go
To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Go API reference documentation.
To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Java API reference documentation.
To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Node.js API reference documentation.
To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Python API reference documentation.
To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
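As an illustration, here is a minimal Python sketch that creates an event-driven Cloud Storage-to-Cloud Storage job with the google-cloud-storage-transfer library. The function name and all argument values are placeholders of my choosing:

from google.cloud import storage_transfer


def create_event_driven_gcs_transfer(
    project_id: str,
    description: str,
    source_bucket: str,
    sink_bucket: str,
    pubsub_subscription: str,
):
    """Creates an event-driven transfer between two Cloud Storage buckets.

    pubsub_subscription is the full resource name:
    projects/PROJECT_NAME/subscriptions/SUBSCRIPTION_ID
    """
    client = storage_transfer.StorageTransferServiceClient()

    transfer_job = client.create_transfer_job(
        {
            "transfer_job": {
                "project_id": project_id,
                "description": description,
                "status": storage_transfer.TransferJob.Status.ENABLED,
                "transfer_spec": {
                    "gcs_data_source": {"bucket_name": source_bucket},
                    "gcs_data_sink": {"bucket_name": sink_bucket},
                },
                # Omitting event_stream_start_time and
                # event_stream_expiration_time means the job starts
                # immediately and runs until manually stopped.
                "event_stream": {"name": pubsub_subscription},
            }
        }
    )
    print(f"Created transfer job: {transfer_job.name}")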
Set up event-driven transfers from AWS S3
Event-driven transfers from AWS S3 use notifications from Amazon Simple Queue Service (SQS) to know when objects in the source bucket have been modified or added. Object deletions are not detected; deleting an object at the source does not delete the associated object in the destination bucket.
Create an SQS queue
In the AWS console, go to the Simple Queue Service page.
Click Create queue.
Enter a Name for this queue.
In the Access policy section, select Advanced. A JSON object is displayed:
{ "Version": "2008-10-17", "Id": "__default_policy_ID", "Statement": [ { "Sid": "__owner_statement", "Effect": "Allow", "Principal": { "AWS": "01234567890" }, "Action": [ "SQS:*" ], "Resource": "arn:aws:sqs:us-west-2:01234567890:test" } ] }
The values of AWS and Resource are unique for each project.

Copy your specific values of AWS and Resource from the displayed JSON into the following JSON snippet:

{
  "Version": "2012-10-17",
  "Id": "example-ID",
  "Statement": [
    {
      "Sid": "example-statement-ID",
      "Effect": "Allow",
      "Principal": {
        "Service": "s3.amazonaws.com"
      },
      "Action": "SQS:SendMessage",
      "Resource": "RESOURCE",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "AWS"
        },
        "ArnLike": {
          "aws:SourceArn": "S3_BUCKET_ARN"
        }
      }
    }
  ]
}
The values of the placeholders in the preceding JSON use the following format:

- AWS is a numeric value representing your Amazon Web Services account. For example, "aws:SourceAccount": "1234567890".
- RESOURCE is an Amazon Resource Name (ARN) that identifies this queue. For example, "Resource": "arn:aws:sqs:us-west-2:01234567890:test".
- S3_BUCKET_ARN is an ARN that identifies the source bucket. For example, "aws:SourceArn": "arn:aws:s3:::example-aws-bucket". You can find a bucket's ARN from the Properties tab of the bucket details page in the AWS console.
Replace the JSON displayed in the Access policy section with the updated JSON above.
Click Create queue.
Once complete, note the Amazon Resource Name (ARN) of the queue. The ARN has the following format:
arn:aws:sqs:us-east-1:1234567890:event-queue
Enable notifications on your S3 bucket
In the AWS console, go to the S3 page.
In the Buckets list, select your source bucket.
Select the Properties tab.
In the Event notifications section, click Create event notification.
Specify a name for this event.
In the Event types section, select All object create events.
As the Destination, select SQS queue, then select the queue you created for this transfer.
Click Save changes.
Configure permissions
Follow the instructions in Configure access to a source: Amazon S3 to create either an access key ID and secret key, or a Federated Identity role.
Replace the custom permissions JSON with the following:
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "sqs:DeleteMessage", "sqs:ChangeMessageVisibility", "sqs:ReceiveMessage", "s3:GetObject", "s3:ListBucket" ], "Resource": [ "arn:aws:s3:::AWS_BUCKET_NAME", "arn:aws:s3:::AWS_BUCKET_NAME/*", "AWS_QUEUE_ARN" ] } ] }
Once created, note the following information:
- For a user, note the access key ID and secret key.
- For a Federated Identity role, note the Amazon Resource Name (ARN), which has the format arn:aws:iam::AWS_ACCOUNT:role/ROLE_NAME.
Create a transfer job
You can use the REST API or the Google Cloud console to create an event-driven transfer job.
Cloud console
Go to the Create transfer job page in the Google Cloud console.
Select Amazon S3 as the source type, and Cloud Storage as the destination.
As the Scheduling mode, select Event-driven and click Next step.
Enter your S3 bucket name. The bucket name is the name as it appears in the AWS Management Console. For example, my-aws-bucket.

Select your authentication method and enter the requested information, which you created and noted in the previous section.
Enter the Amazon SQS queue ARN that you created earlier. It uses the following format:
arn:aws:sqs:us-east-1:1234567890:event-queue
Optionally, define any filters, then click Next step.
Select the destination Cloud Storage bucket and, optionally, path.
Optionally, enter a start and end time for the transfer. If you don't specify a time, the transfer will start immediately and will run until manually stopped.
Specify any transfer options. More information is available from the Create transfers page.
Click Create.
Once created, the transfer job starts running and an event listener waits for notifications on the SQS queue. The job details page shows one operation each hour, and includes details on data transferred for each job.
REST
To create an event-driven transfer using the REST API, send the following JSON object to the transferJobs.create endpoint:
{
  "description": "YOUR DESCRIPTION",
  "status": "ENABLED",
  "projectId": "PROJECT_ID",
  "transferSpec": {
    "awsS3DataSource": {
      "bucketName": "AWS_SOURCE_NAME",
      "roleArn": "arn:aws:iam::1234567891011:role/role_for_federated_auth"
    },
    "gcsDataSink": {
      "bucketName": "GCS_SINK_NAME"
    }
  },
  "eventStream": {
    "name": "arn:aws:sqs:us-east-1:1234567891011:s3-notification-queue",
    "eventStreamStartTime": "2022-12-02T01:00:00+00:00",
    "eventStreamExpirationTime": "2023-01-31T01:00:00+00:00"
  }
}
The eventStreamStartTime and eventStreamExpirationTime fields are optional. If the start time is omitted, the transfer starts immediately; if the end time is omitted, the transfer continues until manually stopped.
Client libraries
Go
To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Go API reference documentation.
To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Java API reference documentation.
To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Node.js API reference documentation.
To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Python API reference documentation.
To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
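For comparison with the REST example above, here is a minimal Python sketch of an event-driven AWS S3-to-Cloud Storage job using the google-cloud-storage-transfer library with a federated identity role. The function name and all argument values are placeholders of my choosing:

from google.cloud import storage_transfer


def create_event_driven_aws_transfer(
    project_id: str,
    description: str,
    source_s3_bucket: str,
    sink_gcs_bucket: str,
    sqs_queue_arn: str,
    role_arn: str,
):
    """Creates an event-driven transfer from AWS S3 to Cloud Storage."""
    client = storage_transfer.StorageTransferServiceClient()

    transfer_job = client.create_transfer_job(
        {
            "transfer_job": {
                "project_id": project_id,
                "description": description,
                "status": storage_transfer.TransferJob.Status.ENABLED,
                "transfer_spec": {
                    "aws_s3_data_source": {
                        "bucket_name": source_s3_bucket,
                        # Federated identity role; alternatively, supply an
                        # aws_access_key with an access key ID and secret key.
                        "role_arn": role_arn,
                    },
                    "gcs_data_sink": {"bucket_name": sink_gcs_bucket},
                },
                # The SQS queue ARN, in the format
                # arn:aws:sqs:REGION:ACCOUNT:QUEUE_NAME
                "event_stream": {"name": sqs_queue_arn},
            }
        }
    )
    print(f"Created transfer job: {transfer_job.name}")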