
Celonis Product Documentation

On-premise extractors

This section provides general information about uplink-based on-premise extractors. On-premise extractors are available for SAP and for database connections using JDBC. If you need more than one extractor type, you can install multiple on-premise extractors on the same system.

The following on-premise extractors are available in the Celonis Platform:

  • JDBC Extractor for database connections

  • SAP Extractor


Unless you're using an older version of SAP (4.6C or earlier) or you use PO/PI, we recommend using our newer on-prem clients. For installation instructions, see Installing. The on-prem clients run mostly in the Celonis Platform cloud, so you get these benefits compared to the uplink-based on-premise extractors:

  • Cloud updates - Most of the orchestration and processing has been rewritten to run in the cloud, which significantly simplifies distributing updates through cloud deployments. Upgrading the on-premise component is required only in exceptional cases, if at all.

  • Improved reliability and scalability - Moving processing to the cloud takes load off the on-premise agent and lets us handle more data with fewer on-premise resources.

  • Simplified setup - There's a user-friendly GUI to monitor and manage the on-prem client installation and services. The same SAP connection can be used both for the SAP Extractor and Process Automation. And the format conversion is done in the Celonis Platform cloud, so you don't need to install Microsoft Visual C++ 2010 for it.

The connection between our on-premise extractors and the Celonis Platform is always established by the extractor, so the extractor doesn't have to be reachable from the Celonis Platform. Although extractions appear to be triggered from Data Integration, the on-premise extractor actually polls Data Integration continuously (on average every 7-8 seconds) for new extractions to run. So only a one-way, outbound connection is required.
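This outbound-only polling pattern can be sketched as follows. This is a minimal illustration, not the extractor's actual implementation: `poll_for_work` and the in-memory queue stand in for the real HTTPS call to Data Integration, whose API shape isn't documented here.

```python
from collections import deque

POLL_INTERVAL_SECONDS = 7.5  # the docs cite an average of 7-8 seconds

# Hypothetical queue standing in for extractions configured in Data Integration.
pending = deque(["extraction-1", "extraction-2"])

def poll_for_work():
    """Stand-in for the outbound HTTPS request to Data Integration.

    The extractor always initiates this request, so Data Integration never
    needs to reach the extractor host: a one-way connection is enough.
    """
    return pending.popleft() if pending else None

def run_poll_loop(max_polls=3):
    """Poll repeatedly; each poll either finds a pending extraction or nothing."""
    handled = []
    for _ in range(max_polls):
        job = poll_for_work()
        if job is not None:
            handled.append(job)  # here the real extractor would run the extraction
        # time.sleep(POLL_INTERVAL_SECONDS)  # the real loop waits between polls
    return handled
```

Because every request originates on-premise, the only firewall requirement is outbound HTTPS from the extractor host to the Celonis Platform.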

These are the tasks that an on-premise extractor does:

  1. Establish the connection to Data Integration.

  2. Poll Data Integration for new extractions to be executed and collect information.

  3. Run the extraction in the source system.

  4. Receive the data and transform it to parquet format.

  5. Use the Data Push API to push the parquet files to Data Integration.
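Steps 3-5 above form a small pipeline: extract, convert, push. The sketch below illustrates that flow under stated assumptions: the row data is invented, JSON stands in for the real parquet conversion (to keep the sketch free of third-party libraries), and `push_to_data_integration` is a placeholder for the authenticated Data Push API upload, whose endpoints aren't documented in this section.

```python
import json

def extract_from_source():
    """Step 3 stand-in: rows as they might arrive from the source system."""
    return [{"id": 1, "status": "open"}, {"id": 2, "status": "closed"}]

def to_parquet_bytes(rows):
    """Step 4 stand-in: the real extractor writes parquet files;
    JSON is used here only to keep the sketch dependency-free."""
    return json.dumps(rows).encode("utf-8")

def push_to_data_integration(payload):
    """Step 5 stand-in for the Data Push API upload (an outbound HTTPS POST).
    The real call authenticates against the Celonis Platform; this placeholder
    only reports the payload size."""
    return {"uploaded_bytes": len(payload)}

def run_extraction():
    """Chain the three steps the way the extractor's task list describes."""
    rows = extract_from_source()
    payload = to_parquet_bytes(rows)
    return push_to_data_integration(payload)
```

Note that the push in step 5 is again an outbound request from the extractor, consistent with the one-way connection model described above.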

These are the responsibilities of Data Integration in the process:

  1. Define tables and filters for an extraction.

  2. Run the Data Job with the extraction.

  3. Receive the parquet files and insert them into the database.