Celonis Product Documentation

Set-up of the Replication Data Pool

When the Replication Cockpit is used as a tool to establish real-time connectivity to your SAP systems, there are several different options for how your overall Data Pipeline can be set up and how the Data Pools are organized. This document aims to share our Best Practices for different scenarios that involve the Replication Cockpit and multiple processes.

Why the Extractions of multiple Processes cannot be strictly separated in Data Pools

The Real-Time Extractions in the Replication Cockpit are based on changelogs in the SAP system. On every execution, the Replication Cockpit processes the entries in the changelog and, once the extraction has succeeded, cleans up the changelog table. As a result, each table can only be extracted once via the Replication Cockpit. If one table is extracted via multiple Data Pools, the replications interfere with each other: each run removes changelog entries that the other pools have not yet read, which leads to missing data.

→ Our recommendation is to have only one single Data Pool that extracts the data via the Replication Cockpit.
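The effect described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not the Replication Cockpit's actual implementation: the `Changelog` and `DataPool` classes, their methods, and the record keys are hypothetical names introduced for illustration. It models one changelog that is cleared after each successful read and two pools that replicate from it in turn.

```python
from dataclasses import dataclass, field


@dataclass
class Changelog:
    """Hypothetical stand-in for a changelog table in the source system."""
    entries: list = field(default_factory=list)

    def record(self, key: str) -> None:
        # A change to a source record adds one entry to the changelog.
        self.entries.append(key)

    def read_and_clean(self) -> list:
        # Return all pending entries, then clear them, mimicking the
        # clean-up that follows a successful extraction.
        batch = list(self.entries)
        self.entries.clear()
        return batch


@dataclass
class DataPool:
    """Hypothetical Data Pool that replicates from a shared changelog."""
    name: str
    extracted: set = field(default_factory=set)

    def replicate(self, changelog: Changelog) -> None:
        self.extracted.update(changelog.read_and_clean())


changelog = Changelog()
pool_a = DataPool("pool_a")
pool_b = DataPool("pool_b")

changelog.record("order_1")
pool_a.replicate(changelog)  # pool_a reads order_1 and clears the log
changelog.record("order_2")
pool_b.replicate(changelog)  # pool_b only ever sees order_2
changelog.record("order_3")
pool_a.replicate(changelog)  # pool_a sees order_3, but never order_2

all_changes = {"order_1", "order_2", "order_3"}
missing_a = all_changes - pool_a.extracted
missing_b = all_changes - pool_b.extracted
print("missing in pool_a:", sorted(missing_a))
print("missing in pool_b:", sorted(missing_b))
```

In this simplified model neither pool ends up with the complete set of changes, whereas a single pool reading the same changelog would receive all three.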

Option 1: One global Data Pool for Extractions and separate Pools for Transformations per Process

→ Use Case: Using Real-Time Extractions but no Real-Time Transformations

Single Extraction Data Pool: In this scenario, we have one single Data Pool that contains all Data Connections as well as all Data Extractions from the source systems. This means the extracted tables for all processes are combined in this pool, which serves as central storage for all raw data.

Separated Transformation Data Pools: For every single process or use case, there is a separate Data Pool that contains all data transformations and the Data Models built on top.

Export/Import: This option leverages the Data Transfer (Export/Import) functionality at the Data Pool level. It allows you to export data from one pool to another in the form of database views.

Impact:

  • Single Extraction Data Pool allows full support of Real-Time Extractions in the Replication Cockpit

  • Use Cases are still divided in Pools and permissions can be managed on a use case level

  • Real-Time Transformations are not possible, as they require Extractions and Transformations to reside in the same Data Pool

Option 2: One global Data Pool for all Extractions and Transformations

→ Use Case: Using Real-Time Extractions and Real-Time Transformations

Single Global Data Pool: In this scenario, we have one single Data Pool that contains everything: all Data Connections, all Data Extractions and all Data Transformations for all processes and use cases. There is no longer any separation of use cases at the Data Pool level. Such a distinction can only be achieved through different Data Jobs and Data Models.

Support for Real-Time Transformations: This set-up is the only one that comes with full support for Real-Time Transformations. The transformations in Replication Cockpit are directly coupled to the extractions, which requires both to be executed in the same Data Pool.

Impact:

  • One single Data Pool covering all parts of the Data Pipeline

  • Full Real-Time connectivity to SAP, including Real-Time Extractions and Real-Time Transformations

  • No clear separation of use cases
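The choice between the two options can be summarized as a simple decision rule. The following is a minimal sketch that only restates the trade-offs listed above; the `Setup` enum and the `recommend_setup` function are hypothetical names and not part of any Celonis API.

```python
from enum import Enum


class Setup(Enum):
    """The two set-ups described in this document (names are illustrative)."""
    OPTION_1 = "One global extraction pool, separate transformation pools per process"
    OPTION_2 = "One global pool for all extractions and transformations"


def recommend_setup(needs_realtime_transformations: bool) -> Setup:
    # Real-Time Transformations require extractions and transformations
    # to reside in the same Data Pool, so only Option 2 supports them.
    if needs_realtime_transformations:
        return Setup.OPTION_2
    # Otherwise Option 1 keeps use cases separated per pool.
    return Setup.OPTION_1


print(recommend_setup(needs_realtime_transformations=False).value)
print(recommend_setup(needs_realtime_transformations=True).value)
```

In both cases the extractions themselves stay in a single Data Pool; the options differ only in where the transformations and Data Models live.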