
Connecting to Amazon S3 (extractor)

The Celonis Amazon S3 extractor lets you bring data stored in Amazon S3 into the Celonis Platform for process mining and analysis. It supports the following basic features:

Prerequisites

This section lists the prerequisites and background knowledge you need before using this extractor.

When connecting to Amazon S3, note the following points:

  • Table schema: There are two ways to get your files in the Amazon S3 bucket translated into a table structure:

    • Add specific files to an extraction. In this case, each file will result in a corresponding table with the table name matching the file name.

    • Add a complete folder to an extraction. In this case, all files in the folder are combined into one table, with the table name matching the folder name. This method requires all files in the folder to conform to the same schema and file type.

  • Filtering and delta loads: Filtering is not supported for S3 extractions, so no delta filters are available. You can still run an extraction as a full load or as a delta load: a full load replaces the existing tables, while a delta load appends the extracted records to them.

  • Data access: The Celonis extractor performs read-only operations on your S3 bucket. No write operations (such as updates or deletions) are performed during the extraction process.

  • Security: Data is transferred from S3 to the target system over HTTPS, so it is encrypted in transit.

  • Supported file types: CSV, JSON, and Parquet.
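The table-naming and load rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the extractor's actual implementation: the function names, the file extensions checked, and the choice to drop the extension from the table name are all assumptions. The sketch checks only that files in a folder share one file type; it does not verify that they share a schema.

```python
from pathlib import PurePosixPath

# Assumed extensions for the supported file types (CSV, JSON, Parquet).
SUPPORTED_SUFFIXES = {".csv", ".json", ".parquet"}


def table_for_file(key: str) -> str:
    """Single-file mode: one table per file, named after the file.

    Assumption: the file extension is not part of the table name.
    """
    path = PurePosixPath(key)
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported file type: {key}")
    return path.stem


def table_for_folder(folder: str, keys: list[str]) -> str:
    """Folder mode: all files are combined into one table named after the folder."""
    suffixes = {PurePosixPath(k).suffix.lower() for k in keys}
    # Every file must have the same, supported file type.
    if len(suffixes) != 1 or not suffixes <= SUPPORTED_SUFFIXES:
        raise ValueError(f"files in {folder!r} must share one supported file type")
    return PurePosixPath(folder).name


def apply_load(existing: list[dict], extracted: list[dict], mode: str) -> list[dict]:
    """Full load replaces the table; delta load appends to it."""
    if mode == "full":
        return list(extracted)
    if mode == "delta":
        return existing + extracted
    raise ValueError(f"unknown load mode: {mode}")
```

Because no delta filter exists, a delta load in this model appends every extracted record, including records that were already loaded by an earlier run.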

To connect the Celonis Platform to Amazon S3, you must provide an access key ID and a secret access key, which the Amazon S3 REST API uses to authenticate requests.

You can create both the access key ID and the secret access key on the My Security Credentials page of your AWS account. When creating these credentials, assign the following permissions:

  • s3:GetBucketAcl

  • s3:GetObject

  • s3:ListBucket

  • ACL: bucket-owner-full-control (if writing files to the S3 bucket from an external location)

A JSON example of these permissions is:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Celonis S3 Extractor",
            "Effect": "Allow",
            "Principal": {
                "AWS": "<THE ARN OF YOUR IAM USER>"
            },
            "Action": [
                "s3:GetBucketAcl",
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::<YOUR BUCKET>",
                "arn:aws:s3:::<YOUR BUCKET>/*"
            ]
        }
    ]
}
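If you manage several buckets, it can help to generate this policy instead of editing the placeholders by hand. The following is a minimal sketch that uses only the Python standard library; the function name `build_bucket_policy` and the example bucket name and ARN are hypothetical. It uses "2012-10-17" for the Version element, which is the current IAM policy language version.

```python
import json


def build_bucket_policy(bucket: str, iam_user_arn: str) -> dict:
    """Return a read-only bucket policy with the permissions listed above."""
    return {
        # "2012-10-17" identifies the policy language version, not an edit date.
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Celonis S3 Extractor",
                "Effect": "Allow",
                "Principal": {"AWS": iam_user_arn},
                "Action": [
                    "s3:GetBucketAcl",
                    "s3:GetObject",
                    "s3:ListBucket",
                ],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",    # the bucket itself (list, ACL)
                    f"arn:aws:s3:::{bucket}/*",  # every object in the bucket
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name and IAM user ARN, for illustration only.
    policy = build_bucket_policy(
        "my-example-bucket",
        "arn:aws:iam::123456789012:user/celonis-extractor",
    )
    print(json.dumps(policy, indent=4))
```

The two Resource entries are both needed: s3:ListBucket and s3:GetBucketAcl apply to the bucket ARN, while s3:GetObject applies to the object ARNs under it.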

For a full list of available endpoints for Amazon S3, see the Amazon S3 API Reference.

Configuring the Amazon S3 extractor

This section describes the basic setup of configuring the Amazon S3 extractor. To configure the extractor:

Note

For configuration, the Amazon S3 extractor has specific requirements for the access key. For more information, see Amazon S3 authentication methods.