Celonis Product Documentation

General Overview


Supported Database Types

The database (JDBC) connector lets you connect to any SQL database via JDBC. The databases listed below are supported directly. You can also connect to any other database by supplying its JDBC driver.

  • Amazon Athena

  • Amazon Redshift

  • Azure SQL

  • Azure Synapse

  • Cloudera Impala

  • Google BigQuery

  • HANA (encrypted or unencrypted)

  • Hive

  • IBM DB2

  • InterSystems Caché


  • MySQL

  • Netezza

  • OpenEdge

  • Oracle

  • Postgres (encrypted or unencrypted)

  • SAP MaxDB

  • Snowflake

  • Sybase

  • Teradata

  • Trino

Separate drivers

Please note that, for legal reasons, we cannot provide JDBC drivers for certain database types when they are used with an on-premise Extractor. Currently, you need to provide the driver yourself for the following database types when running the Extractor locally:

  • Amazon Redshift

  • Cloudera Impala

  • Google BigQuery

  • HANA

  • Hive

  • MySQL

  • SAP MaxDB

  • Sybase

  • Teradata

When running the Extractor from the command line, specify the driver as follows:

java -Dloader.path=<path_to_driver> -jar <connector_file_name>.jar

When running the Extractor as a service, change the arguments line in the CelonisJDBCExtractor.xml file as follows:

<arguments>"%BASE%\temp" -Dloader.path=<path_to_driver> -jar connector-jdbc.jar</arguments>
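The command-line form above can be sketched as a small Python helper that assembles the argument list. This is only an illustration: the function name `build_extractor_command` and the example paths are hypothetical and are not part of the Extractor. The one detail it encodes is that `-Dloader.path` has to appear before `-jar`, because the Java launcher passes everything after the JAR file name to the application as program arguments rather than treating it as a JVM option.

```python
import shlex


def build_extractor_command(driver_path: str, connector_jar: str) -> list[str]:
    """Return the argument list for starting the Extractor with an external JDBC driver.

    Both parameters are placeholders for the <path_to_driver> and
    <connector_file_name>.jar values from the documentation.
    """
    if not driver_path:
        raise ValueError("driver_path must not be empty")
    if not connector_jar.endswith(".jar"):
        raise ValueError("connector_jar must be a .jar file")
    # JVM options such as -D... must precede -jar; anything after the JAR
    # name is handed to the application as a program argument.
    return ["java", f"-Dloader.path={driver_path}", "-jar", connector_jar]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    command = build_extractor_command("/opt/drivers/mysql-connector-j.jar", "connector-jdbc.jar")
    print(shlex.join(command))
```

The returned list can be passed to `subprocess.run(command, check=True)`, which avoids shell quoting problems when the driver path contains spaces.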


  • Can I extract data via an ODBC interface?

    ODBC (Open Database Connectivity) is an interface that applications use to access databases. JDBC (Java Database Connectivity) is the equivalent interface for Java programs. The JDBC Extractor itself connects via JDBC; with a JDBC-ODBC bridge, ODBC-accessible databases can be accessed through the JDBC interface.

Connection Options

There are two scenarios:

A) You do not want to, or cannot, allow the EMS to access your database directly, and you want to use an on-premise Extractor instead.

B) You want to allow the EMS to access your database directly.

Extraction Flow

Security 101

  1. What protocol is used for the communication between the database and the EMS?

    The extractor server connects to the EMS for data extraction (full load or delta load) over HTTPS on port 443, encrypted with TLS 1.2.

  2. For which databases is encryption in transit enabled by default, and for which does it need to be enabled via additional parameters?

    Encryption in transit is generally handled by the JDBC driver, so the details are normally specified in the JDBC driver documentation. For HANA and Postgres, for example, there are separate connection templates (e.g. HANA Encrypted) that enforce encryption by automatically using the respective driver.

    For most databases, encryption can also be activated by adding a JDBC connection parameter (in most cases encrypt=true) to the additional properties of the data connection.

    For some database types (such as Snowflake) the in-transit encryption is automatically enforced by the database server and doesn't need to be configured.

  3. Which protocol is used for encryption?

    TLS 1.2

  4. How is pseudonymization handled?

    Pseudonymization is handled by the extractor before the data is written to Parquet files and inserted into the EMS. For direct database connections, this happens at the moment the files are sent to the cloud; for uplinked connections, it happens on-premise on the extractor server.

  5. How and where are the username and password stored?

    Username and password (and everything else you see in the data connection form for any extractor) are:

    • converted to a byte array ({username: 'celonis'} becomes 7b 75 73 65 72 6e 61 6d 65 3a 20 27 63 65 6c 6f 6e 69 73 27 7d)

    • encrypted with a tenant-specific ID
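
The byte-array conversion in the first step can be reproduced with a short Python sketch. It assumes UTF-8 encoding, which the documentation does not state explicitly, and the helper name `to_hex_bytes` is hypothetical. The subsequent encryption step is not shown because its algorithm is not described here.

```python
def to_hex_bytes(text: str) -> str:
    """Encode text as bytes (UTF-8 assumed) and return them as space-separated hex pairs."""
    return text.encode("utf-8").hex(" ")


if __name__ == "__main__":
    # The example string from the list above.
    print(to_hex_bytes("{username: 'celonis'}"))
    # prints: 7b 75 73 65 72 6e 61 6d 65 3a 20 27 63 65 6c 6f 6e 69 73 27 7d
```

Decoding works the same way in reverse: `bytes.fromhex(hex_string).decode("utf-8")` returns the original string, which is a quick way to check what a given byte sequence actually contains.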