Data extractions and transformations for object-centric process mining
Celonis uses extraction and transformation tasks to convert business data into an object-centric data model, delivering a flexible, system-agnostic view of organizational processes. Production environments support both predefined catalog configurations and custom pipelines tailored to proprietary source systems. The extracted and transformed data populates the centralized object-centric data pool.
The following diagram displays the data flow architecture:

Connect: Establishes the secure connection between the source system and the Celonis Platform. Connection methodology depends on the specific source architecture. See: Connecting data sources.
Extract: Executes data jobs (extraction tasks) to pull raw fields from source systems into the Celonis Platform prior to modeling. See: Extractions.
Transform: Processes the staged raw data through transformation tasks to generate unified objects, events, changes, and relationships. See: Transformations.
Store: Commits the newly generated objects, events, changes, and relationships directly to the database layer within the OCPM data pool. See: Object-centric data pool.
Extractions pull relevant business data from your source systems (like SAP ECC, Oracle EBS, or other databases) and prepare it for transformation into the object-centric model. Extractions operate as data collection pipelines that gather the right data without altering it.
Extraction tasks execute the following actions:
Isolate source components: Locate the exact source ERP tables containing required process logs (such as the accounting table BSEG within an SAP ECC environment) and select the specific tracking attributes (such as document ID, vendor, and baseline date) required for ingestion.
Ingest raw records: Pull the raw data directly into the Celonis Platform staging area without altering formatting, preserving original system logs completely intact before triggering the modeling layer.
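The two extraction actions above can be illustrated with a minimal sketch. It is not the Celonis extraction engine: an in-memory SQLite database stands in for the source system, and the `extract` helper, the column names (`BELNR`, `LIFNR`, `ZFBDT`), and the sample rows are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical stand-in for a source ERP system (not a real SAP connection).
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE BSEG (BELNR TEXT, LIFNR TEXT, ZFBDT TEXT, INTERNAL_FLAG TEXT)"
)
source.executemany(
    "INSERT INTO BSEG VALUES (?, ?, ?, ?)",
    [
        ("5100000001", "V-100", "20240115", "X"),
        ("5100000002", "V-200", "20240120", ""),
    ],
)


def extract(conn, table, columns):
    """Select only the requested columns and return the rows unmodified."""
    column_list = ", ".join(columns)
    cursor = conn.execute(f"SELECT {column_list} FROM {table}")
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Isolate the source table and attributes, then ingest the raw values as-is.
staged = extract(source, "BSEG", ["BELNR", "LIFNR", "ZFBDT"])
```

The staged rows keep the source formatting (for example, the baseline date stays the raw string `20240115`); any parsing or renaming is left to the transformation step.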
Using predefined extractions or creating custom extractions
Predefined extractions: Use predefined extractions for standard catalog processes such as Accounts Payable, Order Management, or Procurement. These configurations automatically identify the target tables and fields within SAP ECC or Oracle EBS.
Creating custom extractions: Build custom extractions when a source system is not covered by the standard catalog, or when the process requires custom tables, attributes, or proprietary logic. Define the target tables and columns using SQL, then pair them with custom transformations to populate new objects or events.
Transformation tasks convert raw staged data into structured objects, events, changes, and relationships within the object-centric data model to enable cross-process analysis.
Transformation tasks execute the following actions:
Generate process objects: Map raw staged data to specific object types (for example, converting individual transactional records into a single, distinct Accounts Payable Invoice object) and route structural state updates into dedicated change tables.
Construct process events: Derive discrete operational events from object state changes, automatically generating unified event logs populated with precise execution timestamps, event IDs, and transactional values.
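As a rough sketch of these two actions, the following Python maps staged records to an Invoice object, routes state updates into change rows, and derives events from status changes. The record layout, the `STATUS_EVENTS` mapping, the ID formats, and the event type names are all hypothetical; real transformations are defined in the platform, not in Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    id: str
    vendor: str


@dataclass(frozen=True)
class Change:
    object_id: str
    attribute: str
    old_value: str
    new_value: str
    time: str


@dataclass(frozen=True)
class Event:
    id: str
    type: str
    object_id: str
    time: str


# Hypothetical mapping from a new status value to the event it implies.
STATUS_EVENTS = {"APPROVED": "ApproveInvoice", "PAID": "PayInvoice"}


def transform(headers, change_log):
    """Build objects, change rows, and events from staged records."""
    objects, changes, events = [], [], []
    for row in headers:
        object_id = f"Invoice_{row['doc_id']}"
        objects.append(Invoice(object_id, row["vendor"]))
        events.append(
            Event(f"CreateInvoice_{row['doc_id']}", "CreateInvoice",
                  object_id, row["created_at"])
        )
    # ISO 8601 timestamps sort correctly as plain strings.
    for row in sorted(change_log, key=lambda r: r["changed_at"]):
        object_id = f"Invoice_{row['doc_id']}"
        changes.append(
            Change(object_id, row["attribute"], row["old"], row["new"],
                   row["changed_at"])
        )
        # Only status changes with a known target state produce an event.
        if row["attribute"] == "status" and row["new"] in STATUS_EVENTS:
            event_type = STATUS_EVENTS[row["new"]]
            events.append(
                Event(f"{event_type}_{row['doc_id']}_{row['changed_at']}",
                      event_type, object_id, row["changed_at"])
            )
    return objects, changes, events


staged_headers = [
    {"doc_id": "5100000001", "vendor": "V-100",
     "created_at": "2024-01-15T09:00:00"},
]
staged_change_log = [
    {"doc_id": "5100000001", "attribute": "status", "old": "APPROVED",
     "new": "PAID", "changed_at": "2024-01-20T14:00:00"},
    {"doc_id": "5100000001", "attribute": "status", "old": "OPEN",
     "new": "APPROVED", "changed_at": "2024-01-16T10:30:00"},
    {"doc_id": "5100000001", "attribute": "payment_terms", "old": "NET30",
     "new": "NET45", "changed_at": "2024-01-17T08:00:00"},
]

objects, changes, events = transform(staged_headers, staged_change_log)
```

In this sketch every state update lands in the change rows, while only the status transitions listed in `STATUS_EVENTS` become events, so the change history and the event log can grow independently.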
An example of transformations within Objects and Events:

Customizing or extending transformations
Predefined transformations accelerate implementation by covering standard processes. Customize or extend transformation tasks only when adapting to unique business requirements.
Customize transformations: Adjust how the system populates objects or events to accommodate specific operational needs, such as managing missing data fields, defining complex table joins, or remapping system IDs.
Extend transformations: Populate custom attributes or build entirely new object and event types without replacing or modifying the original catalog logic.
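The idea of extending without modifying the original logic can be sketched as simple composition. The function names, the `custom_region` attribute, and the record layout below are hypothetical and only illustrate the principle; they do not reflect how the platform stores or executes transformations.

```python
def catalog_invoice_transformation(record):
    """Stand-in for predefined catalog logic; it is never edited here."""
    return {"id": f"Invoice_{record['doc_id']}", "vendor": record["vendor"]}


def region_extension(record):
    """Hypothetical extension that contributes one custom attribute."""
    # A missing source field falls back to a placeholder value.
    return {"custom_region": record.get("region", "UNKNOWN")}


def build_invoice(record, extensions=()):
    """Run the catalog logic first, then layer extension attributes on top."""
    obj = dict(catalog_invoice_transformation(record))
    for extension in extensions:
        extra = extension(record)
        overlap = obj.keys() & extra.keys()
        if overlap:
            # An extension may add attributes but must not overwrite existing ones.
            raise ValueError(f"extension overwrites attributes: {sorted(overlap)}")
        obj.update(extra)
    return obj


extended = build_invoice(
    {"doc_id": "5100000001", "vendor": "V-100", "region": "EMEA"},
    extensions=[region_extension],
)
fallback = build_invoice(
    {"doc_id": "5100000002", "vendor": "V-200"},
    extensions=[region_extension],
)
```

Because the catalog function is only called and never changed, dropping the extension from the list returns the object to its predefined shape.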
The object-centric data pool acts as the centralized repository for all modeled data, containing objects, events, changes, relationships, and transformations. Each object and event type populates a dedicated table, while supporting tables track changes and relationships. By default, a single shared model resides within each data pool to establish a unified workspace and a single source of truth.
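To make the table layout concrete, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for the data pool. The table and column names are illustrative assumptions, not the naming scheme the platform actually generates.

```python
import sqlite3

# Hypothetical stand-in for an object-centric data pool.
pool = sqlite3.connect(":memory:")
pool.executescript(
    """
    -- One dedicated table per object type.
    CREATE TABLE object_invoice (id TEXT PRIMARY KEY, vendor_id TEXT);
    CREATE TABLE object_vendor (id TEXT PRIMARY KEY, name TEXT);

    -- One dedicated table per event type.
    CREATE TABLE event_approve_invoice (id TEXT PRIMARY KEY, time TEXT);

    -- Supporting tables: attribute changes and event-to-object relationships.
    CREATE TABLE changes_invoice (
        object_id TEXT, attribute TEXT, old_value TEXT, new_value TEXT, time TEXT
    );
    CREATE TABLE rel_approve_invoice_invoice (event_id TEXT, object_id TEXT);
    """
)

table_names = sorted(
    row[0]
    for row in pool.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

Listing `table_names` shows the separation described above: object and event types each get their own table, while changes and relationships live in supporting tables that reference them by ID.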
The data pool supports isolated development (test) and production (read-only) environments, so transformations can be built and validated safely before they are deployed to production. Access controls govern data pool permissions, automatically routing URLs and API calls to the active model configuration.
Deploy a single data pool and model for standard configurations. Enable multiple models only when compliance mandates or organizational structures require strict data isolation. Object-centric and case-centric assets can coexist within the same data pool, but each individual data model must remain entirely unified as a single structural type.