Amazon S3 connection
Important
Any references to third-party products or services do not constitute Celonis Product Documentation nor do they create any contractual obligations. This material is for informational purposes only and is subject to change without notice.
Celonis does not warrant the availability, accuracy, reliability, completeness, or usefulness of any information regarding the subject of third-party services or systems.
Step 1: Configure a user with sufficient permissions in S3
Create an Access Key and Secret Access Key using the following steps:
Go to your Account in AWS.
Select My Security Credentials.
Select the Access keys (access key ID and secret access key) section.
Click Create New Access Key, then download the key pair to your system for future use. Click Show Access Key to view your Access Key ID and Secret Access Key.
At a minimum, set these permissions:
s3:GetBucketAcl
s3:GetObject
s3:ListBucket
Here's a JSON example:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Celonis S3 Extractor",
      "Effect": "Allow",
      "Principal": { "AWS": "<THE ARN OF YOUR IAM USER>" },
      "Action": [
        "s3:GetBucketAcl",
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<YOUR BUCKET>",
        "arn:aws:s3:::<YOUR BUCKET>/*"
      ]
    }
  ]
}
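The policy can also be generated programmatically, which helps avoid typos in the ARNs. The following is a minimal sketch using only the Python standard library. The function name `build_bucket_policy` and the bucket and user ARN in the usage note are hypothetical placeholders, not part of any Celonis or AWS tooling.

```python
import json


def build_bucket_policy(bucket: str, user_arn: str) -> dict:
    """Return a bucket policy granting the minimum read permissions listed above."""
    return {
        # Policy language version string defined by AWS, not a free-form date.
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Celonis S3 Extractor",
                "Effect": "Allow",
                "Principal": {"AWS": user_arn},
                "Action": ["s3:GetBucketAcl", "s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",    # bucket-level actions
                    f"arn:aws:s3:::{bucket}/*",  # object-level actions
                ],
            }
        ],
    }


def policy_as_json(bucket: str, user_arn: str) -> str:
    """Serialize the policy so it can be pasted into the bucket policy editor."""
    return json.dumps(build_bucket_policy(bucket, user_arn), indent=2)
```

For example, `policy_as_json("example-bucket", "arn:aws:iam::123456789012:user/example-user")` returns the policy text for a hypothetical bucket and user.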
Note
When an external application writes files to the S3 bucket, the following ACL setting must also be applied: "ACL": "bucket-owner-full-control"
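As an illustration of the note above, an external application that uploads with the AWS SDK for Python (boto3) could pass the ACL on each upload. This is a sketch, not a prescribed method: the helper name `upload_with_owner_control` is hypothetical, and it assumes the caller supplies a boto3 S3 client.

```python
def upload_with_owner_control(s3_client, bucket: str, key: str, body: bytes) -> None:
    """Upload an object and grant the bucket owner full control over it.

    `s3_client` is assumed to be a boto3 S3 client, for example
    `boto3.client("s3")`, created by the caller.
    """
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        ACL="bucket-owner-full-control",  # canned ACL named in the note above
    )
```

Passing the client in as an argument keeps the helper independent of how credentials are configured in the uploading application.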
Step 2: Allowlist EMS IP addresses
Note
Follow this step only if your system is IP-blocked.
If your system is only reachable within a specific IP range, you need to allowlist the outbound IPs of the EMS; otherwise, the data cannot be extracted. The EMS IPs differ depending on the cluster (eu-1 or us-1). For a list of clusters and their IP addresses, see Allowlisting Celonis IP addresses and third party domains.
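If the IP restriction is enforced through an S3 bucket policy (an assumption; your setup may use a firewall or another mechanism instead), the allowlisted addresses typically appear in an `aws:SourceIp` condition. The sketch below builds such a condition block and validates the addresses first. The function name `source_ip_condition` is hypothetical, and the addresses in the usage note are reserved documentation ranges, not real Celonis IPs.

```python
import ipaddress


def source_ip_condition(cidrs: list[str]) -> dict:
    """Build a policy Condition block that matches the given source IP ranges."""
    # ip_network raises ValueError on malformed input, so typos fail early.
    # strict=False accepts a host address with a prefix and masks it to the network.
    normalized = [str(ipaddress.ip_network(cidr, strict=False)) for cidr in cidrs]
    return {"IpAddress": {"aws:SourceIp": normalized}}
```

For example, `source_ip_condition(["203.0.113.10", "198.51.100.0/24"])` produces a condition for one single address and one range; the real values come from the Celonis allowlisting page for your cluster.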
Step 3: Create a Data Connection in a Data Pool
Enter the name of the new S3 Data Connection.
Enter the AWS region in which your bucket is hosted.
Choose the bucket from which you want to extract.
Enter the Access Key ID.
Specify the Access Key Secret.
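Before entering these values, it can help to confirm that the access key actually has the bucket-level permissions from Step 1. The following sketch assumes a boto3 S3 client created by the caller; the helper name `check_extractor_access` is hypothetical. It does not test `s3:GetObject`, because that requires the key of an existing object.

```python
def check_extractor_access(s3_client, bucket: str) -> dict:
    """Try the bucket-level calls that correspond to the minimum permissions.

    Returns a mapping of permission name to True (the call succeeded)
    or to the error text (the call failed).
    """
    checks = {
        "s3:GetBucketAcl": lambda: s3_client.get_bucket_acl(Bucket=bucket),
        "s3:ListBucket": lambda: s3_client.list_objects_v2(Bucket=bucket, MaxKeys=1),
    }
    results = {}
    for permission, call in checks.items():
        try:
            call()
            results[permission] = True
        except Exception as exc:  # boto3 reports denied access as an exception
            results[permission] = str(exc)
    return results

# Possible usage, assuming boto3 is installed and the placeholders are filled in:
#   import boto3
#   client = boto3.client(
#       "s3",
#       region_name="<YOUR REGION>",
#       aws_access_key_id="<ACCESS KEY ID>",
#       aws_secret_access_key="<SECRET ACCESS KEY>",
#   )
#   print(check_extractor_access(client, "<YOUR BUCKET>"))
```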
Table schema
There are two ways to translate the files in your Amazon S3 bucket into a table structure:
Add specific files to an extraction. In this case, each file will result in a corresponding table with the table name matching the file name.
Add a complete folder to an extraction. In this case, all files in the folder are combined into one table, with the table name matching the folder name. This method requires all files in the folder to conform to the same schema and file type.
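The two naming rules above can be sketched as follows. This is an illustration only: the function names are hypothetical, and dropping the file extension from the table name is an assumption, since the text only says that the table name matches the file name.

```python
from pathlib import PurePosixPath


def table_for_file(key: str) -> str:
    """A single file becomes its own table, named after the file."""
    # Assumption: the extension is not part of the table name.
    return PurePosixPath(key).stem


def table_for_folder(prefix: str) -> str:
    """All files in a folder are combined into one table, named after the folder."""
    return PurePosixPath(prefix.rstrip("/")).name
```

Under these assumptions, a file `sales/orders.csv` maps to a table `orders`, while extracting the folder `sales/2024/` maps every file in it to a single table `2024`.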
Filtering and delta loads
Filtering is not supported for S3 extraction. Therefore, there are no delta filters.
You can execute your extraction as full loads or as delta loads. For full loads, the extraction will replace existing tables. For delta loads, it will append the records to the existing tables.
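The difference between the two load modes can be modeled in a few lines. This is a simplified sketch of the semantics described above, not the extractor's implementation; the function name `apply_load` and the mode strings are hypothetical.

```python
def apply_load(existing: list[dict], extracted: list[dict], mode: str) -> list[dict]:
    """Return the table contents after a load.

    "full"  replaces the existing table with the extracted records.
    "delta" appends the extracted records to the existing table.
    """
    if mode == "full":
        return list(extracted)
    if mode == "delta":
        return list(existing) + list(extracted)
    raise ValueError(f"unknown load mode: {mode!r}")
```

In this simplified model nothing is filtered, so running a delta load twice over the same records adds them to the table twice.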
Data access
The Celonis Extractor performs read-only operations on your S3 bucket. No write operations (such as updates or deletions) are performed during the extraction process.
Security
Transfer of the data from S3 to the target system is secured through HTTPS, which allows for an encrypted exchange of information.
API
The Amazon S3 SDK is used to send requests to S3 and to receive responses and data: https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3Client.html
Notes and troubleshooting
Supported file types are CSV, JSON, and Parquet.
When extracting folders from an S3 bucket, ensure that the files' metadata stays consistent.
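A quick pre-check of a folder's contents can catch mixed or unsupported file types before an extraction is configured. The sketch below is one possible check, not a Celonis feature. The function name `folder_file_type` is hypothetical, and identifying the file type by the extensions `.csv`, `.json`, and `.parquet` is an assumption.

```python
from pathlib import PurePosixPath

# Assumption: supported file types are recognized by these extensions.
SUPPORTED_SUFFIXES = {".csv", ".json", ".parquet"}


def folder_file_type(keys: list[str]) -> str:
    """Return the single file extension shared by all keys in a folder.

    Raises ValueError if the folder is empty, mixes file types,
    or contains an unsupported type.
    """
    suffixes = {PurePosixPath(key).suffix.lower() for key in keys}
    if len(suffixes) != 1:
        raise ValueError(f"expected exactly one file type, found: {sorted(suffixes)}")
    (suffix,) = suffixes
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported file type: {suffix!r}")
    return suffix
```

This only compares file extensions; it does not verify that the files share the same columns, which still needs to be checked separately.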