On-premise extractors

This section provides general information about uplink-based on-premise extractors. On-premise extractors are available for SAP and for database connections using JDBC. If you need more than one extractor type, you can install multiple on-premise extractors on the same system.

SAP on-premise extractors will be deprecated by the end of December 2026

To ensure continuity of data flow, all customers using SAP on-premise extractors (OPE) must migrate to the more modern On-Premise Client (OPC) by the end of December 2026. For more information about the deprecation of OPE, see BREAKING DATA INTEGRATION Deprecating SAP on-prem extractor: Action required before December 31, 2026.

The on-prem clients run mostly in the Celonis Platform cloud, so you'll get these benefits compared to the uplink-based on-premise extractors:

Cloud updates - Most of the orchestration and processing has been rewritten to run in the cloud. This significantly simplifies the distribution of updates across cloud deployments. Upgrading the on-premise component is required only in exceptional cases, if at all.
Improved reliability and scalability - Moving the processing to the cloud takes the load off the on-premise agent and allows us to handle more data with fewer on-premise resources.
Simplified setup - There's a user-friendly GUI to monitor and manage the on-prem client installation and services. The same SAP connection can be used for both the SAP Extractor and Process Automation. The format conversion is done in the Celonis Platform cloud, so you don't need to install Microsoft Visual C++ 2010 for it.

For instructions on installing the on-prem client (OPC), see Installing.

The following on-premise extractors are available in the Celonis Platform:

JDBC Extractor for database connection
SAP Extractor

The connection between an on-premise extractor and the Celonis Platform is always established by the extractor, so the extractor doesn't have to be reachable from the Celonis Platform. Although an extraction appears to be triggered from Data Integration, the on-premise extractor actually polls Data Integration continuously (on average every 7-8 seconds) for new extractions to run. This means only a one-way connection, opened by the extractor, is required.
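The polling pattern described above can be illustrated with a minimal Python sketch. It is not Celonis code: `FakeDataIntegration`, `schedule`, `next_extraction`, and `poll` are hypothetical names, the cloud side is an in-memory stand-in, and drawing each wait uniformly between 7 and 8 seconds is only an assumption about how the documented average interval might be produced.

```python
import random
from collections import deque


class FakeDataIntegration:
    """Hypothetical in-memory stand-in for the cloud side (not a Celonis API)."""

    def __init__(self):
        self._pending = deque()

    def schedule(self, extraction_id):
        # Stands in for a user running a Data Job in Data Integration.
        self._pending.append(extraction_id)

    def next_extraction(self):
        # Only answers a request opened by the extractor; it never calls the extractor.
        return self._pending.popleft() if self._pending else None


def poll(cloud, cycles, sleep, min_wait=7.0, max_wait=8.0):
    """Poll `cloud` for `cycles` rounds and return the extraction ids picked up."""
    picked_up = []
    for _ in range(cycles):
        job = cloud.next_extraction()
        if job is not None:
            picked_up.append(job)
        # Assumed wait between polls: uniform in [min_wait, max_wait] seconds.
        sleep(random.uniform(min_wait, max_wait))
    return picked_up


cloud = FakeDataIntegration()
cloud.schedule("extraction-1")

waits = []  # record the waits instead of actually sleeping
picked = poll(cloud, cycles=3, sleep=waits.append)
print(picked, len(waits))
```

In a long-running process you would pass `time.sleep` as the `sleep` argument. Because every request starts at the extractor, a firewall in this model only needs to allow traffic initiated from the on-premise side.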

An on-premise extractor performs these tasks:

Establish the connection to Data Integration.
Poll Data Integration for new extractions to run and collect the information about them.
Run the extraction in the source system.
Receive the data and convert it to Parquet format.
Use the Data Push API to push the Parquet files to Data Integration.
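The order of these tasks can be sketched as a single function that runs one cycle. This is an illustration under stated assumptions, not the extractor's implementation: `run_once` and every `demo_*` helper are hypothetical, the request fields `id` and `table` are made up, and the conversion step uses JSON bytes as a placeholder where a real extractor would write Parquet and call the Data Push API.

```python
import json


def run_once(poll, extract, to_parquet, push):
    """Run one cycle: poll, extract, convert, push. Returns the number of rows pushed."""
    request = poll()                 # ask Data Integration for work
    if request is None:
        return 0                     # nothing scheduled in this cycle
    rows = extract(request)          # run the extraction in the source system
    payload = to_parquet(rows)       # convert the received data
    push(request, payload)           # hand the file to Data Integration
    return len(rows)


# Hypothetical demo wiring; none of these names come from the product.
queue = [{"id": "extraction-1", "table": "ORDERS"}]
pushed = []


def demo_poll():
    return queue.pop(0) if queue else None


def demo_extract(request):
    # Stands in for a JDBC or SAP read of request["table"].
    return [{"order_id": 1}, {"order_id": 2}]


def demo_to_parquet(rows):
    # Placeholder only: JSON bytes instead of a real Parquet file.
    return json.dumps(rows).encode("utf-8")


def demo_push(request, payload):
    # Stands in for a Data Push API call.
    pushed.append((request["id"], payload))


first = run_once(demo_poll, demo_extract, demo_to_parquet, demo_push)
second = run_once(demo_poll, demo_extract, demo_to_parquet, demo_push)
print(first, second, len(pushed))
```

Passing the four steps in as callables keeps the sketch independent of any particular source system or HTTP client. Each stub can be replaced separately.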

These are the responsibilities of Data Integration in the process:

Define tables and filters for an extraction.
Run the Data Job with the extraction.
Receive the Parquet files and insert them into the database.
