Celonis Product Documentation

Configuration

The sink offers a set of configuration keys alongside the Kafka Connect defaults for converters, consumer settings, and so on.

Here is the full list:

Key

Description

Type

Required

Default

Version

connect.ems.allow.null.pk

Allow messages with null values in the columns listed as primary keys. If disabled, the connector will fail after receiving such a message.

NOTE: Enabling this may cause data inconsistencies on the Celonis Platform side.

BOOL

NO

false

From 1.7.1

connect.ems.authorization.key

Contains the Celonis Platform API Authorization header. It should be AppKey <<app-key>> or Bearer <<api-key>>.

STRING

YES

null

connect.ems.client.id

An optional parameter representing the client's unique identifier.

STRING

NO

null

connect.ems.commit.interval.ms

The time interval in milliseconds to upload the data to Celonis Platform if the other two commit policies are not yet applicable. It cannot be less than 1 second.

LONG

YES

null

connect.ems.commit.records

The maximum number of records in the accumulated file before it is uploaded to Celonis Platform.

INT

YES

null

connect.ems.commit.size.bytes

The maximum size of the accumulated file before it is uploaded to Celonis Platform. It cannot be less than 100 kb. A file is also uploaded when one of the other commit policies is triggered first, so a file smaller than 1MB can still be uploaded if the record count, the time interval, or a schema change comes first.

LONG

YES

null
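
The three commit settings above act as "whichever comes first" thresholds. The following is a minimal sketch of that decision, not the connector's actual code: the `CommitPolicy` class and `should_upload` function are hypothetical names, the use of `>=` comparisons is an assumption, and uploads caused by schema changes are not modelled.

```python
from dataclasses import dataclass


@dataclass
class CommitPolicy:
    """Hypothetical holder for the three commit settings (not a connector class)."""
    interval_ms: int  # connect.ems.commit.interval.ms
    max_records: int  # connect.ems.commit.records
    max_bytes: int    # connect.ems.commit.size.bytes


def should_upload(policy, elapsed_ms, records, file_bytes):
    """Return True as soon as any one of the three thresholds is reached."""
    return (
        records >= policy.max_records
        or file_bytes >= policy.max_bytes
        or elapsed_ms >= policy.interval_ms
    )


policy = CommitPolicy(interval_ms=60_000, max_records=10_000, max_bytes=2_000_000)

# Record count reached first: the file is uploaded although it is still small.
print(should_upload(policy, elapsed_ms=5_000, records=10_000, file_bytes=300_000))

# No threshold reached yet: keep accumulating.
print(should_upload(policy, elapsed_ms=5_000, records=42, file_bytes=300_000))
```

The first call prints `True` and the second prints `False`.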

connect.ems.connection.id

Optional parameter. It represents the unique Celonis Platform connection identifier.

STRING

NO

null

connect.ems.convert.decimals.to.double

Celonis Platform currently does not support ingesting decimal values from Parquet files. This flag enables conversion from decimal to double (float64).

BOOL

NO

false

connect.ems.data.fallback.varchar.length

An optional parameter representing the STRING (VARCHAR) length when the schema is created in Celonis Platform. This value must be between 1 and 65000.

STRING

NO

null

connect.ems.data.primary.key

An optional comma-separated list of columns that are the primary keys of the Celonis Platform table.

If not specified, no primary key will be used, uniqueness will not be enforced, and the data will not be deduplicated.

STRING

NO

null

connect.ems.debug.keep.parquet.files

For debugging purposes, set this to true to make the connector keep the Parquet files after they have been uploaded.

BOOL

NO

false

connect.ems.embed.kafka.metadata

Include Kafka metadata fields (kafkaOffset, kafkaPartition, and kafkaTimestamp) in the target Celonis Platform table.

BOOL

YES

true

connect.ems.endpoint

Contains the Celonis Platform API endpoint in the form of: https://[team].[realm].celonis.cloud/continuous-batch-processing/api/v1/[pool-id]/items

STRING

YES

null
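
As an illustration of the endpoint pattern above, the sketch below fills in the bracketed placeholders. The function name `ems_endpoint` and the sample values `myteam`, `myrealm`, and `my-pool-id` are made up for this example; use the values of your own Celonis Platform team and data pool.

```python
def ems_endpoint(team, realm, pool_id):
    """Fill the documented URL pattern with concrete values."""
    return (
        f"https://{team}.{realm}.celonis.cloud"
        f"/continuous-batch-processing/api/v1/{pool_id}/items"
    )


# Placeholder values only, not a real team or pool.
print(ems_endpoint("myteam", "myrealm", "my-pool-id"))
```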

connect.ems.error.policy

Specifies the action to be taken if an error occurs while inserting the data. There are three options:

  • CONTINUE - The error is swallowed.

  • THROW - The error is allowed to propagate.

  • RETRY - The exception causes the Connect framework to retry the message. The number of retries is set by connect.ems.max.retries.

All errors will be logged automatically, even if the code swallows them.

STRING

NO

THROW
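
To make the three error policies concrete, here is a small simulation. It is only a sketch under assumptions: in the real connector the retry is carried out by the Kafka Connect framework, whereas this example loops locally. The function `write_with_policy` and its parameters are hypothetical, and treating the retry interval as milliseconds is an assumption, since the table does not state the unit.

```python
import time


def write_with_policy(write, policy="THROW", max_retries=10,
                      retry_interval_ms=1000, sleep=time.sleep):
    """Call write() and react to failures according to the error policy."""
    attempts = 0
    while True:
        try:
            write()
            return True
        except Exception as err:
            print(f"write failed: {err}")  # errors are logged under every policy
            if policy == "CONTINUE":
                return False               # swallow the error and move on
            if policy == "RETRY" and attempts < max_retries:
                attempts += 1
                sleep(retry_interval_ms / 1000)  # assumed to be milliseconds
                continue
            raise                          # THROW, or retries exhausted


calls = {"n": 0}


def flaky_write():
    """Hypothetical upload that fails twice and then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("temporary upload failure")


# A no-op sleep keeps the example fast.
ok = write_with_policy(flaky_write, policy="RETRY", max_retries=5,
                       sleep=lambda _seconds: None)
print(ok, calls["n"])
```

With RETRY, the flaky write succeeds on the third attempt. With THROW, the first failure would propagate immediately.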

connect.ems.explode.mode

When each incoming record is a list of records, this will explode (flatten) the records on output. The possible values are:

  • NONE

  • LIST - the record must be a List of record types. The sink will discard the List wrapper and write each record.

STRING

NO

NONE
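
The effect of the two explode modes can be sketched as follows. This is an illustration only: records are modelled as Python dictionaries, and the `explode` function is a hypothetical helper, not part of the connector.

```python
def explode(value, mode="NONE"):
    """Return the records that would be written for one incoming message."""
    if mode == "NONE":
        return [value]  # the message is written as a single record
    if mode == "LIST":
        if not isinstance(value, list) or not all(isinstance(r, dict) for r in value):
            raise ValueError("LIST mode expects a list of records")
        return list(value)  # drop the list wrapper, write each record
    raise ValueError(f"unknown explode mode: {mode}")


batch = [{"id": 1}, {"id": 2}, {"id": 3}]
print(len(explode(batch, mode="LIST")))
```

Here one incoming message containing three records results in three written records.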

connect.ems.flattener.collections.discard

Discard array and map fields. Default behavior is to transform them into JSON-encoded strings.

BOOL

NO

false

connect.ems.flattener.enable

Enable message flattening transformation. This has to be set to true when the source topic contains nested data.

BOOL

NO

true
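
The two flattener settings can be pictured with the sketch below. It makes several assumptions that the table does not confirm: nested field names are joined with an underscore, nested dictionaries stand in for nested structures, and only Python lists are treated as collections. The `flatten` function is a hypothetical helper written for this example.

```python
import json


def flatten(record, discard_collections=False, prefix="", sep="_"):
    """Flatten nested dictionaries; encode or drop list fields."""
    out = {}
    for key, value in record.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Nested structure: lift its fields to the top level.
            out.update(flatten(value, discard_collections, name, sep))
        elif isinstance(value, list):
            # Collection: JSON-encode by default, drop when discarding.
            if not discard_collections:
                out[name] = json.dumps(value)
        else:
            out[name] = value
    return out


order = {"id": 1, "customer": {"name": "Ada"}, "tags": ["new", "eu"]}
print(flatten(order))
print(flatten(order, discard_collections=True))
```

In the first call the `tags` list becomes a JSON-encoded string; in the second call it is dropped.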

connect.ems.flattener.jsonblob.chunks

The number of string chunks the input record should be JSON encoded into. The byte-size of each JSON-encoded chunk is driven by the connect.ems.data.fallback.varchar.length parameter, which needs to be supplied in order for this configuration key to be accepted.

INT

NO

null
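
The relationship between the chunk count and the VARCHAR length can be sketched like this. It is a rough illustration under assumptions: the `json_chunks` helper is hypothetical, raising an error when the record does not fit is a guess at the behaviour, and chunks are returned as bytes so that multi-byte characters are never split incorrectly.

```python
import json


def json_chunks(record, chunks, varchar_length):
    """Split the JSON encoding of a record into a fixed number of byte chunks."""
    encoded = json.dumps(record).encode("utf-8")
    if len(encoded) > chunks * varchar_length:
        # Assumption: a record that does not fit is treated as an error.
        raise ValueError("record does not fit into the configured chunks")
    return [
        encoded[i * varchar_length:(i + 1) * varchar_length]
        for i in range(chunks)
    ]


parts = json_chunks({"a": "x" * 10}, chunks=3, varchar_length=8)
print([len(p) for p in parts])
```

Each chunk holds at most `varchar_length` bytes, and joining the chunks in order gives back the full JSON document.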

connect.ems.inmemfs.enable

Rather than writing to the host file system, buffer Parquet data files in memory.

BOOL

NO

false

connect.ems.max.retries

The maximum number of times to re-attempt to write the records before the task is marked as failed.

INT

NO

10

connect.ems.obfuscation.fields

An optional value for comma-separated fields to obfuscate. It supports nested values, including arrays.

STRING

NO

null

connect.ems.obfuscation.method

The connector offers three obfuscation methods: fix, sha1, and sha512. With fix, string values are replaced with *****. With sha512, a salt is required; see connect.ems.obfuscation.sha512.salt.

STRING

NO

fix

connect.ems.obfuscation.sha512.salt

Required only when connect.ems.obfuscation.method is set to sha512 and obfuscation fields have been set. If no obfuscation fields have been provided, this configuration is ignored.

STRING

NO

null
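
The three obfuscation methods can be illustrated with Python's standard `hashlib` module. This is a sketch, not the connector's implementation: how the salt is combined with the value (prepended here) and the hexadecimal output format are assumptions, and the `obfuscate` function is a hypothetical helper.

```python
import hashlib


def obfuscate(value, method="fix", salt=None):
    """Return an obfuscated form of a string value."""
    if method == "fix":
        return "*****"  # every value becomes the same fixed mask
    if method == "sha1":
        return hashlib.sha1(value.encode("utf-8")).hexdigest()
    if method == "sha512":
        if salt is None:
            raise ValueError("sha512 requires a salt")
        # Assumption: the salt is prepended to the value before hashing.
        return hashlib.sha512((salt + value).encode("utf-8")).hexdigest()
    raise ValueError(f"unknown obfuscation method: {method}")


print(obfuscate("jane.doe@example.com"))
print(len(obfuscate("jane.doe@example.com", method="sha1")))
print(len(obfuscate("jane.doe@example.com", method="sha512", salt="my-salt")))
```

The fix method hides the value entirely, while the hash methods map equal inputs to equal outputs, so obfuscated columns can still be grouped or joined.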

connect.ems.order.field.name

Optional parameter used only when primary keys are set. It needs to be a sortable field, present in the incoming data, to allow record deduplication for records sharing the same primary key(s). For details, see the Primary Key(s) section.

STRING

NO

null

connect.ems.parquet.row.group.size.bytes

The row group size, in bytes, of the rows in the generated parquet files.

INT

NO

1048576

From 1.7.2

connect.ems.row.size.bytes

Constrained to MIN(row.group.size.bytes, commit.size.bytes)

INT

NO

null

 

connect.ems.pool.explicit.close

Connection pool - Explicitly close connections on completion of request.

BOOL

NO

false

connect.ems.pool.keepalive

Connection pool - Number of milliseconds to keep connection alive.

LONG

NO

300000

connect.ems.pool.max.idle

Connection pool - Maximum number of idle connections to allow.

INT

NO

5

connect.ems.proxy.auth.password

The password for proxy authentication, if a proxy is required to access external services.

STRING

NO

null

connect.ems.proxy.auth.type

If a proxy is required to access external services, the type of proxy authentication to use. There is currently one available option:

  • BASIC - Basic Authentication will be used

STRING

NO

null

connect.ems.proxy.auth.username

The username for proxy authentication, if a proxy is required to access external services.

STRING

NO

null

connect.ems.proxy.host

The hostname of the proxy server, if a proxy is required to access external services.

STRING

NO

null

connect.ems.proxy.port

The port number of the proxy server, if a proxy is required to access external services.

INT

NO

null

connect.ems.retry.interval

The interval to wait between retries when using RETRY mode.

LONG

NO

1000

connect.ems.sink.put.timeout.ms

The maximum time (in milliseconds) for the connector task to complete the upload of a single Parquet file before being flagged as failed.

Note

This value should always be lower than max.poll.interval.ms.

LONG

NO

288000

From 1.8.1

connect.ems.target.table

The table in Celonis Platform to store the data.

STRING

YES

null

connect.ems.tmp.dir

The folder where the connector stores temporary files while it accumulates data. If not specified, it uses System.getProperty("java.io.tmpdir").

STRING

NO

System temp directory
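
To tie the table together, the sketch below checks a connector configuration against some of the constraints listed above before it is submitted to Kafka Connect. It is only an illustration: the `validate` function is hypothetical, all values in `example` are placeholders, and the checks cover just the rules stated in the table (required keys, the 1 second minimum interval, the VARCHAR length range, the sha512 salt, the jsonblob chunk dependency, and the error policy names, assumed here to be upper case). The key connect.ems.embed.kafka.metadata is left out of the required list because it has a default value.

```python
REQUIRED_KEYS = (
    "connect.ems.authorization.key",
    "connect.ems.commit.interval.ms",
    "connect.ems.commit.records",
    "connect.ems.commit.size.bytes",
    "connect.ems.endpoint",
    "connect.ems.target.table",
)
ERROR_POLICIES = {"CONTINUE", "THROW", "RETRY"}


def validate(config):
    """Return a list of problems found in a connector configuration."""
    problems = []
    for key in REQUIRED_KEYS:
        if not config.get(key):
            problems.append(f"missing required key: {key}")

    interval = config.get("connect.ems.commit.interval.ms")
    if interval and int(interval) < 1000:
        problems.append("connect.ems.commit.interval.ms cannot be less than 1 second")

    varchar = config.get("connect.ems.data.fallback.varchar.length")
    if varchar and not 1 <= int(varchar) <= 65000:
        problems.append("connect.ems.data.fallback.varchar.length must be between 1 and 65000")

    if config.get("connect.ems.flattener.jsonblob.chunks") and not varchar:
        problems.append("connect.ems.flattener.jsonblob.chunks needs "
                        "connect.ems.data.fallback.varchar.length")

    if (config.get("connect.ems.obfuscation.fields")
            and config.get("connect.ems.obfuscation.method") == "sha512"
            and not config.get("connect.ems.obfuscation.sha512.salt")):
        problems.append("connect.ems.obfuscation.sha512.salt is required for sha512")

    if config.get("connect.ems.error.policy", "THROW") not in ERROR_POLICIES:
        problems.append("connect.ems.error.policy must be CONTINUE, THROW, or RETRY")
    return problems


# Placeholder values only; Kafka Connect passes settings as strings.
example = {
    "connect.ems.endpoint": "https://myteam.myrealm.celonis.cloud"
                            "/continuous-batch-processing/api/v1/my-pool-id/items",
    "connect.ems.authorization.key": "AppKey my-app-key",
    "connect.ems.target.table": "orders",
    "connect.ems.commit.interval.ms": "60000",
    "connect.ems.commit.records": "10000",
    "connect.ems.commit.size.bytes": "2000000",
    "connect.ems.error.policy": "RETRY",
    "connect.ems.max.retries": "5",
}
print(validate(example))
```

An empty list means none of the checked rules is violated; it does not guarantee that the connector will accept the configuration.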