Celonis Product Documentation

RFC Module Description

Explanation of SAP extraction logic

The entry point of the extraction process is the RFC-enabled function /CELONIS/EX_NEW. The Extractor service passes several parameters to this function, including the table, columns, filters, and other extraction metadata. /CELONIS/EX_NEW then creates a background job and schedules it to run the program /CELONIS/EXTRACTION.

During an extraction, a cursor is opened on the table being extracted, with a default buffer size of 10,000 lines (the number of lines held in memory at a time). These lines are written to CSV files in the location specified by the file path Z_CELONIS_TARGET. When a CSV file exceeds a specific number of lines (default: 50,000 lines), it is closed and then, depending on the specified compression type, compressed. The extraction then continues by writing the next file.
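The buffering and file rotation described above can be sketched in Python. This is only an illustration of the idea, not the ABAP implementation of /CELONIS/EXTRACTION: the function names, the file naming scheme, and the choice to rotate as soon as a file reaches the limit are assumptions. The two default values come from the text.

```python
import csv
import os
from typing import Iterable, Iterator, List, Sequence

BUFFER_SIZE = 10_000     # lines held in memory per fetch (default from the text)
MAX_FILE_LINES = 50_000  # lines per CSV file before it is closed (default from the text)


def fetch_in_buffers(rows: Iterable[Sequence], buffer_size: int) -> Iterator[List[Sequence]]:
    """Yield rows in buffers of at most buffer_size, similar to a cursor fetch."""
    buffer: List[Sequence] = []
    for row in rows:
        buffer.append(row)
        if len(buffer) == buffer_size:
            yield buffer
            buffer = []
    if buffer:
        yield buffer


def write_chunk_files(rows: Iterable[Sequence], target_dir: str,
                      buffer_size: int = BUFFER_SIZE,
                      max_file_lines: int = MAX_FILE_LINES) -> List[str]:
    """Write rows to numbered CSV files and start a new file at the line limit."""
    paths: List[str] = []
    handle = None
    writer = None
    lines_in_file = 0
    for buffer in fetch_in_buffers(rows, buffer_size):
        for row in buffer:
            if handle is None:
                # Hypothetical file naming; the real chunk names are not documented here.
                path = os.path.join(target_dir, f"chunk_{len(paths):05d}.csv")
                handle = open(path, "w", newline="")
                writer = csv.writer(handle)
                paths.append(path)
                lines_in_file = 0
            writer.writerow(row)
            lines_in_file += 1
            if lines_in_file >= max_file_lines:
                handle.close()  # compression would be applied here, if configured
                handle = None
    if handle is not None:
        handle.close()
    return paths
```

With a buffer of 10 and a file limit of 50, 120 rows would end up in three files of 50, 50, and 20 lines. Keeping the buffer smaller than the file limit means memory use depends on the buffer size only, not on the file size.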

The Extractor actively polls the directory Z_CELONIS_TARGET via the RFC function /CELONIS/EX_LIST_FILES. As soon as it identifies a CSV file in the directory, it invokes the function /CELONIS/EX_UPLOAD_POLL to fetch the file from the SAP file system to the Extractor server. Once the transfer is complete, the file is removed from the SAP file system via an RFC call to /CELONIS/EX_CLEANUP. If the user cancels the job, the Extractor calls /CELONIS/EX_CANCEL, which cancels the job in SAP.

While the extraction is in progress, the Extractor actively polls the background job status by calling /CELONIS/EX_STATUS. Once the job is completed, the Extractor fetches the job logs via /CELONIS/EX_READ_LOG and publishes them to Celonis as part of the extraction logs.
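The interaction between the Extractor and the RFC functions can be summarized as a polling loop. The sketch below is a simplified illustration, not the actual Extractor code. Only the function module names are taken from the description above; the `call` callable, the parameter names (`TABLE`, `COLUMNS`, `JOB_ID`, `FILE`), the return shapes, and the status values `FINISHED` and `ABORTED` are hypothetical placeholders.

```python
import time
from typing import Callable, Dict, List

# Stand-in for an RFC invocation: call(function_name, **params) -> dict.
# Signature, parameter names, and return shapes are assumptions for illustration.
RfcCall = Callable[..., Dict]


def run_extraction(call: RfcCall, table: str, columns: List[str],
                   poll_seconds: float = 1.0) -> Dict:
    """Start an extraction, fetch files as they appear, and return the job log."""
    job_id = call("/CELONIS/EX_NEW", TABLE=table, COLUMNS=columns)["JOB_ID"]
    fetched: List[str] = []

    def drain_files() -> None:
        # List the finished CSV files, fetch each one, then remove it on the SAP side.
        for name in call("/CELONIS/EX_LIST_FILES", JOB_ID=job_id)["FILES"]:
            call("/CELONIS/EX_UPLOAD_POLL", JOB_ID=job_id, FILE=name)
            call("/CELONIS/EX_CLEANUP", JOB_ID=job_id, FILE=name)
            fetched.append(name)

    while True:
        drain_files()
        status = call("/CELONIS/EX_STATUS", JOB_ID=job_id)["STATUS"]
        if status in ("FINISHED", "ABORTED"):  # hypothetical status values
            break
        time.sleep(poll_seconds)

    drain_files()  # pick up files written just before the job ended
    log = call("/CELONIS/EX_READ_LOG", JOB_ID=job_id)["LOG"]
    return {"status": status, "files": fetched, "log": log}
```

The final `drain_files()` call after the loop is a design choice of this sketch: a file may be closed between the last directory listing and the status check, so one more listing avoids leaving it behind.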

How does a filter work

Filters are appended to the SELECT statement of the cursor. Arbitrary filters are possible on the tables, which means that correct filtering on indexed columns can greatly influence the performance of an extraction.

How does a join work

Joins are not translated directly into the SELECT query; instead, each join uses its own cursor. After a chunk of the join partner has been read, a FOR ALL ENTRIES statement is used to select the matching rows from the target table. Additional filters are appended as usual.
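The chunked join can be illustrated with plain Python lists in place of database tables. This is a rough analogy under stated assumptions, not the ABAP code: `extract_join`, `chunks`, the dictionary-based rows, and the sample field names are made up for the example. The membership test against the keys of one chunk plays the role of FOR ALL ENTRIES.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Row = Dict[str, str]


def chunks(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield non-empty chunks of at most `size` rows, like fetches from a cursor."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def extract_join(partner_rows: Iterable[Row], target_rows: List[Row], key: str,
                 chunk_size: int,
                 extra_filter: Optional[Callable[[Row], bool]] = None) -> List[Row]:
    """Select target rows whose key occurs in the join partner, one chunk at a time."""
    result: List[Row] = []
    for chunk in chunks(partner_rows, chunk_size):
        # Keys of the current chunk; this stands in for the FOR ALL ENTRIES table.
        keys = {row[key] for row in chunk}
        for row in target_rows:
            # Additional filters are applied together with the key condition.
            if row[key] in keys and (extra_filter is None or extra_filter(row)):
                result.append(row)
    return result


# Usage with made-up sample data: headers are the join partner, items the target table.
headers = [{"VBELN": "1"}, {"VBELN": "2"}, {"VBELN": "3"}]
items = [
    {"VBELN": "1", "POSNR": "10"},
    {"VBELN": "1", "POSNR": "20"},
    {"VBELN": "2", "POSNR": "10"},
    {"VBELN": "4", "POSNR": "10"},
    {"VBELN": "3", "POSNR": "10"},
]
joined = extract_join(headers, items, key="VBELN", chunk_size=2)
```

Only one chunk of join partner keys is held in memory at a time. One consequence of this sketch is that a key appearing in two different chunks would return its target rows twice; whether the real extraction deduplicates in that case is not stated in the description above.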

Other RFC Operations
Selecting Tables/Columns to Extract

The user can select which tables and columns to extract in the Celonis Cloud. The table list is displayed to the user by calling the functions /CELONIS/EX_LOOKUP_TABLES and /CELONIS/EX_LIST_TABLES. To list the columns, the function /CELONIS/EX_GET_METADATA is called.

Logging System Information

We log several system-level parameters for internal logging purposes, such as the NetWeaver version and the database type. The function /CELONIS/EX_SYSINFO is invoked to fetch this information before the start of each extraction.

Programs

The following programs are included in the package /CELONIS/DATA_EXTRACTION:

  • /CELONIS/EXTRACTION: This is the program that executes the extraction. It is executed via a background job.

  • /CELONIS/CLEANUP_CL_TABLE: This program cleans up the log tables. It is relevant only for real-time extractions.

  • /CELONIS/CLEANUP: This program removes obsolete files from the directory Z_CELONIS_TARGET. Its execution should be set up manually via a background job.

Function modules

The following function modules are included in the package /CELONIS/DATA_EXTRACTION:

  • /CELONIS/EX_CANCEL: Cancels a running extraction

  • /CELONIS/EX_CLEANUP: Cleans up old extraction files

  • /CELONIS/EX_CONFIGURATION_TEST: Tests the functionality of the extraction

  • /CELONIS/EX_GET_METADATA: Gets the metadata of the extraction tables

  • /CELONIS/EX_LIST_FILES: Lists extraction files

  • /CELONIS/EX_LIST_TABLES: Lists extraction tables

  • /CELONIS/EX_LOOKUP_TABLES: Looks up the table data in DD02L

  • /CELONIS/EX_NEW: Starts a new extraction

  • /CELONIS/EX_READ_LOG: Reads the job log of an extraction

  • /CELONIS/EX_STATUS: Gets the current status of an extraction

  • /CELONIS/EX_SYSINFO: Looks up the SAP system information

  • /CELONIS/EX_UPLOAD_POLL: Exports the extracted CSV file to the Extractor server

  • /CELONIS/EX_UPLOAD: Uploads a result file to the connector [Deprecated after the RFC version 2019-06-14]

  • /CELONIS/EX_CL_GET_TABLE: Gets the change log table

  • /CELONIS/EX_CL_NEW: Starts a new change log extraction

  • /CELONIS/EX_CL_SET_EXTRACTED: Sets changes to extracted

Used external functionality

We use a few standard SAP function modules in our extraction flow:

  • CALCULATE_HASH_FOR_CHAR: For the pseudonymization functionality. The SHA1 algorithm is used by default.

  • VIEW_AUTHORITY_CHECK: To check for read access on a table that is being extracted

  • DDIF_FIELDINFO_GET: To read metadata of a table

  • BP_JOB_ABORT: Cancelling a running job

  • JOB_OPEN, JOB_CLOSE: Submitting a background job

  • BP_JOBLOG_READ: Reading the log of a background job

  • BP_JOB_STATUS_GET: Getting the status of a background job

  • SPLIT_LINE: Splitting lines of log messages

  • C_SAPGPARAM: Querying OS parameters such as the directory separator character

  • FILE_GET_NAME_USING_PATH: Building the path to the chunk files

  • SXPG_COMMAND_EXECUTE: Executing the compression commands (if applicable)

  • CL_HTTP_CLIENT: Instance methods are used to send the result files to the connector

  • EPS_GET_DIRECTORY_LISTING: List files in the output directory
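
The pseudonymization mentioned for CALCULATE_HASH_FOR_CHAR replaces a column value with its hash. The Python sketch below shows the general idea using SHA-1 from the standard library. It does not reproduce the SAP function module: the UTF-8 encoding, the lowercase hexadecimal output, and the helper names `pseudonymize` and `pseudonymize_row` are assumptions, so the resulting values may differ from what the extraction produces.

```python
import hashlib
from typing import Dict, Iterable


def pseudonymize(value: str) -> str:
    """Return the SHA-1 digest of a value as a hexadecimal string."""
    # Encoding and output format are assumptions of this sketch.
    return hashlib.sha1(value.encode("utf-8")).hexdigest()


def pseudonymize_row(row: Dict[str, str], columns: Iterable[str]) -> Dict[str, str]:
    """Return a copy of the row with the given columns replaced by their hashes."""
    hashed = dict(row)
    for column in columns:
        if column in hashed:
            hashed[column] = pseudonymize(hashed[column])
    return hashed
```

Because equal inputs always produce equal hashes, a pseudonymized column can still be used to group or join records without exposing the original value.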