Amazon S3 connection
Steps to set up an S3 data connection
Step 1: Configure a user with sufficient permissions in S3
An Access Key and Access Secret can be created in the following way:
Go to your Account in AWS
Select My Security Credentials
Expand the Access keys (access key ID and secret access key) section.
Click Create New Access Key, then download the key pair to your system for future use. Click Show Access Key to display your Access Key ID and Secret Access Key.
At minimum, the following bucket policy must be set at bucket level to allow access to the data:
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "appflow.amazonaws.com"
      },
      "Action": [
        "s3:GetBucketAcl",
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::",
        "arn:aws:s3:::/*"
      ]
    }
  ]
}
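The two Resource entries in the policy above end right after the `arn:aws:s3:::` prefix, which is where a bucket name normally goes in an S3 ARN. As a minimal sketch, assuming you want to generate the policy for a specific bucket, the following Python builds the same statement. The function name `build_bucket_policy` and the bucket name `my-example-bucket` are hypothetical and used for illustration only.

```python
import json


def build_bucket_policy(bucket_name: str) -> dict:
    """Return the minimum read-only bucket policy for the given bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2008-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "appflow.amazonaws.com"},
                "Action": ["s3:GetBucketAcl", "s3:GetObject", "s3:ListBucket"],
                # The first ARN addresses the bucket itself,
                # the second addresses the objects stored in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }


# "my-example-bucket" is a placeholder, not a real bucket.
policy_json = json.dumps(build_bucket_policy("my-example-bucket"), indent=2)
print(policy_json)
```

The printed JSON can be pasted into the bucket policy editor in the AWS console after replacing the placeholder with your own bucket name.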
Note
When files are written to the S3 bucket from an external application, the following permission must also be set: "ACL": "bucket-owner-full-control"
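How the ACL is set depends on the application that writes the files. As a hedged sketch, assuming the external application uses the AWS SDK for Python (boto3), the `put_object` call accepts an `ACL` parameter. The helper name `upload_with_owner_control` is hypothetical, and the client is passed in so that the function itself does not require AWS credentials.

```python
def upload_with_owner_control(s3_client, bucket: str, key: str, body: bytes) -> None:
    """Upload an object and grant the bucket owner full control over it.

    s3_client is assumed to be a boto3 S3 client, for example
    boto3.client("s3"); it is passed in rather than created here.
    """
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        ACL="bucket-owner-full-control",  # the permission named in the note above
    )
```

Other SDKs and tools expose the same canned ACL under their own option names, so check the documentation of the tool that performs the upload.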
Step 2: Allowlist EMS IP addresses
Note
Follow this step only if your system is IP-blocked.
If your system is reachable only from a certain IP range, you must allowlist the outbound IP addresses of the EMS; otherwise, data cannot be extracted. The EMS IP addresses differ depending on the cluster (eu-1 or us-1). For the list of clusters and their IP addresses, see Allowlisting Celonis IP addresses.
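When reviewing an existing allowlist, it can help to check whether a given address falls inside the configured ranges. The sketch below uses Python's standard `ipaddress` module. The ranges shown are reserved documentation addresses, not real Celonis IPs, and the names `ALLOWLISTED_RANGES` and `is_allowlisted` are hypothetical.

```python
import ipaddress

# Placeholder ranges taken from reserved documentation blocks.
# Replace them with the addresses listed for your cluster.
ALLOWLISTED_RANGES = ["192.0.2.0/24", "198.51.100.17/32"]


def is_allowlisted(ip: str, ranges=ALLOWLISTED_RANGES) -> bool:
    """Return True if ip lies inside at least one allowlisted CIDR range."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in ranges)


print(is_allowlisted("192.0.2.44"))   # True
print(is_allowlisted("203.0.113.9"))  # False
```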
Step 3: Create a Data Connection in a Data Pool
Step-by-step
Enter the name of the new S3 Data Connection.
Enter the AWS region in which your bucket is hosted.
Choose the bucket from which you want to extract data.
Enter the Access Key ID.
Specify the Access Key Secret.
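Before entering these values, you may want to confirm that the key pair can actually list the bucket. The following is a minimal sketch, assuming the AWS SDK for Python (boto3) is available on your machine. The helper name `can_list_bucket` is hypothetical, and the region, bucket, and key values in the comments are placeholders.

```python
def can_list_bucket(s3_client, bucket: str) -> bool:
    """Return True if the client can list the bucket (requires s3:ListBucket)."""
    try:
        # MaxKeys=1 keeps the request small; only success or failure matters.
        s3_client.list_objects_v2(Bucket=bucket, MaxKeys=1)
    except Exception as error:  # boto3 reports access and lookup failures as exceptions
        print(f"Cannot list {bucket}: {error}")
        return False
    return True


# Possible usage with boto3 (placeholder values, not executed here):
#   import boto3
#   client = boto3.client(
#       "s3",
#       region_name="eu-central-1",
#       aws_access_key_id="YOUR_ACCESS_KEY_ID",
#       aws_secret_access_key="YOUR_ACCESS_KEY_SECRET",
#   )
#   can_list_bucket(client, "my-example-bucket")
```

A False result usually points to a wrong region, a mistyped key, or a missing permission, which is easier to diagnose here than after the connection has been saved.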
Data Access
The Celonis Extractor performs read-only operations on your S3 bucket. No write operations (such as updates or deletions) are performed at any time during the extraction process.
Security
Data transfer from S3 to the target system is secured through HTTPS, which encrypts the data in transit.
Used API
Requests to S3, and the responses and data it returns, are handled through the Amazon S3 SDK for Java: https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3Client.html
Notes and troubleshooting
Supported file types are CSV, JSON, and Parquet.
Currently, only full loads are supported.
When extracting folders from an S3 bucket, make sure that the metadata of the files in the folder stays consistent.
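The last note does not say which metadata is meant. Assuming it refers to the column structure of the files, the sketch below compares the header rows of several CSV files and reports those that differ from the first one. The function name `find_inconsistent_headers` and the sample file names are hypothetical, and the file contents are given as in-memory strings so that the example runs without access to a bucket.

```python
import csv
import io
from typing import Dict, List


def find_inconsistent_headers(files: Dict[str, str]) -> List[str]:
    """Return the names of CSV files whose header differs from the first file's."""
    reference = None
    mismatched = []
    for name, content in files.items():
        # Read only the first row; an empty file yields an empty header.
        header = next(csv.reader(io.StringIO(content)), [])
        if reference is None:
            reference = header
        elif header != reference:
            mismatched.append(name)
    return mismatched


sample = {
    "orders_2023.csv": "id,amount,currency\n1,10.5,EUR\n",
    "orders_2024.csv": "id,amount,currency\n2,7.0,USD\n",
    "orders_old.csv": "id,total\n3,4.2\n",
}
print(find_inconsistent_headers(sample))  # ['orders_old.csv']
```

Running a check like this before an extraction can reveal files that were written with a different schema than the rest of the folder.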