Troubleshooting transformations
When configuring and running transformations, you may encounter the following issues:
Performance issues with data jobs can have a variety of root causes. Try these checks as a starting point:
Is smart ETL activated for the OCPM data job (ocpm-data-job)? To check, click the context menu (the three vertical dots) for the OCPM data job, select Execution Settings, and verify that both toggles are activated (they are by default).
Are you using multiple layers of views to pre-process the data? If you decided to pre-process your data in views as described in I want to pre-process my raw data before transforming it into objects and events, keep in mind that a view is a stored SQL query that is executed every time the view is referenced. You can think of it as a table that has to be rebuilt on every reference. If you experience performance issues in the OCPM data job (ocpm-data-job) with this setup, try using tables instead of views in the pre-processing scope, and use ANALYZE_STATISTICS for the created tables. Optimization guide has more tips for optimizing the performance of tables and views.
Does your use case need multiple years of historical data? If you think your performance issue could just be a result of large data volumes, consider limiting the data scope to only a subset, as explained in Creating objects and events for a limited data scope.
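As a minimal sketch of the materialization tip above, a pre-processing view can be replaced with a table and its optimizer statistics collected. The table and view names here are hypothetical, and the statistics call uses Vertica's ANALYZE_STATISTICS function:

```sql
-- Hypothetical example: materialize a pre-processing view as a table
-- so its query isn't re-executed on every reference.
CREATE TABLE preprocessed_orders AS
SELECT * FROM v_preprocessed_orders;  -- the former pre-processing view

-- Collect optimizer statistics for the new table (Vertica syntax).
SELECT ANALYZE_STATISTICS('preprocessed_orders');
```

Any transformations that previously referenced the view would then read from the table instead, at the cost of refreshing the table as part of the data job.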
To keep the SQL transformations as generic as possible, the SQL editor in the Objects and Events environment only supports a subset of functions and operators of ANSI SQL. The list of supported functions is different from that supported for case-centric data models.
You can find the list of allowed SQL functions for object-centric process mining at Supported functions for object-centric transformations. The topic includes advice on how to replace Vertica-specific functions with suitable alternatives.
Let’s look at a concrete example: in this case, the delivery number is in the field VBFA.VBELN, and the delivery item number is in the field VBFA.POSNN, instead of the standard locations MSEG.VBELN_IM and MSEG.VBELP_IM that the Celonis-supplied transformations expect. You have two options:
If you don’t have the right data in MSEG.VBELN_IM and MSEG.VBELP_IM, you can create an overwrite for the DeliveryItem_ID attribute in the transformation for the OutgoingMaterialDocumentItem object type, and join the MSEG table with the VBFA table to get the missing fields. In this case, you only need to overwrite the DeliveryItem_ID attribute.
You can directly overwrite the transformation for the event relationships with Delivery Items.
Use a partial overwrite to replace individual attributes of the type. The overwrite query is joined to the query in the Celonis transformation with a left outer join.
Or use a full overwrite to completely replace the Celonis transformation for the object type or event type. A full overwrite can change IDs, attributes, joins, and filters.
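A partial overwrite for the DeliveryItem_ID attribute in the example above might look like the following sketch. The ID construction and the join condition are assumptions for illustration; they must match the object ID built by the Celonis transformation and your actual VBFA-to-MSEG field mapping:

```sql
-- Hypothetical partial overwrite for the DeliveryItem_ID attribute of
-- OutgoingMaterialDocumentItem: take the delivery item from VBFA
-- instead of the standard MSEG.VBELN_IM / MSEG.VBELP_IM fields.
SELECT
    MSEG.MBLNR || MSEG.ZEILE AS "ID",           -- must equal the ID in the Celonis transformation
    VBFA.VBELN || VBFA.POSNN AS "DeliveryItem_ID"
FROM MSEG
JOIN VBFA
    ON  VBFA.VBELV = MSEG.MBLNR                  -- join condition is an assumption;
    AND VBFA.POSNV = MSEG.ZEILE                  -- adapt it to your source data
```

Because the overwrite query is left-outer-joined to the Celonis query, objects without a matching VBFA row keep a NULL DeliveryItem_ID rather than disappearing.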
Important
Fully overwriting a Celonis object or event type prevents future catalog updates from being applied automatically. For instance, if a new property is added to support a business process app, it won’t be populated in an object-centric data model with a fully overwritten transformation, potentially breaking the app. If you choose a full overwrite, you must manually track and implement updates. Whenever possible, partial overwrites are preferable.
Custom attributes are defined in a single CustomAttribute script under Properties - Custom Properties, using the same ID as the main object. They publish to a separate transformation and table (x_custom_celonis_<Object Name>), which is automatically merged with the main object during data model load.
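As a rough sketch, a CustomAttribute script selects the same ID as the main object plus the extra columns. The table and column names here are hypothetical:

```sql
-- Hypothetical CustomAttribute script: the ID must equal the main
-- object's ID so the results in x_custom_celonis_<Object Name> can be
-- merged with the main object during data model load.
SELECT
    VBAK.VBELN AS "ID",              -- same ID as the main object
    VBAK.AUART AS "OrderCategory"    -- hypothetical custom attribute
FROM VBAK
```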
This error message means that your extracted source system data is missing one or more columns that are required by the Celonis-supplied transformations.
Sometimes the error message reports a column as missing that isn’t used in the named transformation. This can happen if you're using an imported Data Connection from another Data Pool. When you import a Data Connection to the OCPM Data Pool (as described in I have already extracted data into an existing Data Pool that is not the OCPM Data Pool), we create a view that individually selects all columns from the imported table. If one of the selected columns no longer exists, for example because it has been excluded from the extraction settings for the source table, the error message is displayed for any transformation that references the view containing the imported data. If this is the case, you can fix the error by synchronizing the imported connection as described in Sharing data between data pools.
If the columns are used in the named transformation, the best solution is to add the missing columns to your extraction configuration and do a full reload of the tables. If you cannot do a full reload, for example because of archival rules in SAP, you can still add the missing columns to the extraction configuration so that the transformations can run. Note that the new fields will not be filled retrospectively for any records that were extracted before the extraction configuration was updated. This might also have an impact on an analysis based on a perspective that uses the affected objects and events.
This error message probably means that the ID (primary key) is NULL in some of the records that are being inserted into the object type table. NULL values are not allowed for the primary key for an object type. If any of the columns that are concatenated to build the ID contains a NULL value, then the entire resulting ID is NULL.
Check the transformation referenced in the error message and investigate the source data. Most likely, at least one of the columns used to build the ID contains NULL values. You can either eliminate these values during the table extraction, or remove or replace them in a data pre-processing step using the COALESCE() function. I have NULL values in ID attributes explains how to do this.
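For example, a NULL-safe ID could be built in a pre-processing step like this. The table, column names, and fallback values are illustrative:

```sql
-- Hypothetical pre-processing step: replace NULLs before concatenating
-- the columns that form the object ID. If any concatenated column is
-- NULL, the whole ID becomes NULL, so each part gets a fallback value.
SELECT
    COALESCE(t.ORDER_NO, 'UNKNOWN') || '-' || COALESCE(t.ITEM_NO, '0') AS "ID",
    t.*
FROM source_table t
```

Whether a fallback value or filtering out the affected records is appropriate depends on whether those records are meaningful for your process analysis.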