
Troubleshooting the Duplicate Checker app

A single data model can only be used with one Duplicate Checker package or one Accounts Payable Starter Kit. If more than one Duplicate Checker package or Accounts Payable Starter Kit points to the same data model, they will fail with an error saying that there is already another DC implementation running on that data model.

For the augmented table to be generated, the table “GROUPS_DUPLICATE_INVOICES” needs to exist in both the data pool and the data model. Ensure that the sensor has run successfully and that the table exists. Next, make a change to the knowledge model (e.g. by adding a space character after a description) and publish the package. Check the details of the current data model load and verify that “DUPLICATE_GROUP_AUGMENTED_STATUS” is listed there. If this is not the case, create a Service Desk ticket.

Make sure that the corresponding component is enabled in the backend via a Service Desk ticket (see all necessary backend flags in Prerequisites).

Make a change in the knowledge model (e.g. by adding a space character after a description) and then publish the package. This should trigger the ML Sensor.

Error Message: The data pipeline was deleted and the data model reloaded due to a configuration change in this skill or the underlying knowledge model. Everything will be recreated with the updated configurations with the next run of this skill. This will happen automatically. Please wait.
Possible root cause: This happens when a user changes the configuration, either in the skill itself or in the PQL definitions in the knowledge model.
What to do: Wait until the next job has finished.

Error Message: Could not connect to the Query Engine. Please go to Data Integration and check if the data model is loaded.
Possible root cause: The data model is not loaded. This happens if something goes wrong with the reload, e.g. a failed table join.
What to do: Check the failure reason in Data Integration and reload the data model.

Error Message: There is an error in the PQL of the knowledge model filter with ID: {pql_filter.id}. and PQL: {pql_filter.pql}. Please correct the PQL definition in the knowledge model. Error Message: {msg}.
Possible root cause: Either the user has introduced a PQL error in the knowledge model or (more often) the user is using standard PQL definitions that do not match the underlying data model.
What to do: Correct the PQL definitions in the knowledge model.

Error Message: There is an error in the PQL of the selected attribute with ID: {pql_column.id} and PQL: {pql_column.pql}. Please correct the PQL definition in the knowledge model. Error Message: {msg}
Possible root cause: Either the user has introduced a PQL error in the knowledge model or (more often) the user is using standard PQL definitions that do not match the underlying data model.
What to do: Correct the PQL definitions in the knowledge model.

Error Message: No duplicates found with the current selection. Either there are no duplicates in the selected data model, or the selected filters are too restrictive. Please revise the filters.
Possible root cause: Usually caused by a filter that is too restrictive.
What to do: Correct the PQL filter or reversal definition in the knowledge model.

Error Message: 0 rows selected for duplicate checking. Either there exists no data in the selected data model, or the selected filters are too restrictive. Please revise the filters.
Possible root cause: Usually caused by a filter that is too restrictive.
What to do: Correct the PQL filter or reversal definition in the knowledge model.

Error Message: Knowledge model with key {km_key_only} cannot be found in package with key {package_key}. Make sure it exists and the application key (name: duplicate-checking) used by this job has the right permissions.
Possible root cause: Usually this happens if the user edits the application key configuration used by the job (duplicate-checking), either by deleting it or removing permissions.
What to do: Check that the assets exist and that the application key has the right to access the package. If the key does not exist, raise a Service Desk ticket.

Error Message: Package with key {package_key} cannot be found. Make sure it exists and the application key (name: duplicate-checking) used by this job has the right permissions.
Possible root cause: Usually this happens if the user edits the application key configuration used by the job (duplicate-checking), either by deleting it or removing permissions.
What to do: Check that the assets exist and that the application key has the right to access the package. If the key does not exist, raise a Service Desk ticket.

Error Message: Couldn't find data model with id: {datamodel_id}. Make sure it exists and the application key (name: duplicate-checking) used by this job has the right permissions.
Possible root cause: Usually this happens if the user edits the application key configuration used by the job (duplicate-checking), either by deleting it or removing permissions.
What to do: Check that the assets exist and that the application key has the right to access the package. If the key does not exist, raise a Service Desk ticket.

Error Message: Job execution failed. Contact support. Reference id: 004e6e97-94a4-4c8c-9bef-05d39e3167b3
Possible root cause: There is an internal server error.
What to do: Raise a Service Desk ticket and provide the Reference id.

First, make sure that the “Celonis: Create Task” backend flag is enabled. Then use the steps below to create group-level tasks:

  1. Within Studio, open the Duplicate Checker package.

  2. Find and expand the "Group Level Tasks" folder.

  3. Select the "Group Level Tasks" skill.

  4. Click Edit.

  5. Click Add Filter and then select "Create New Filter".

  6. Enter a Filter ID and Filter Display Name.

  7. In the Editor, add a filter that results in 0 signals, such as "BKPF"."BELNR" = '0'.

  8. Click Save.

  9. Save the skill and publish the package.

  10. Repeat Steps 1-4.

  11. Delete the filter.

  12. Save the skill and publish the package.

Make sure that the “Knowledge Model Name” variable (KM_NAME) contains the correct package and knowledge model key (format: package-key.knowledge-key). See Activating augmented attributes in Setting up the full app.
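
As an illustration of the expected format, if the package key were “duplicate-checking-app” and the knowledge model key were “duplicate-checking-km” (both hypothetical keys used purely for this example), the variable value would read:

  KM_NAME = duplicate-checking-app.duplicate-checking-km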

Currently, the runtime of one ML Sensor execution is capped at 12 hours. Executions that exceed this limit are automatically canceled. The time required for the algorithm to check all documents depends not only on the number of documents in scope but, most importantly, on the complexity of the documents, i.e. the number of comparisons.

To estimate the time needed before running the ML Sensor, make use of the following “Runtime Analysis”:

Procedure:

  1. Download the YAML file and push the analysis (asset) into the Duplicate Checking package using the Content-CLI.

  2. Connect the knowledge model to the analysis.

  3. Make sure that the PQL definitions of the record identifier and the four record attributes match the definitions used in the ML Sensor.

  4. Specify the filter conditions used in the ML Sensor.

  5. The estimated total runtime is shown at the bottom left and is highlighted if it exceeds the 12-hour timeframe.

  6. Use additional filter conditions to reduce the document scope.

Package version 2.X is only compatible with the newest algorithm version and contains breaking changes that require a manual migration for all customized PQL statements. Please create a Service Desk ticket to activate the newest algorithm for your EMS team and to receive instructions on how the migration will be performed. The quickest way to get the views working again is to revert the package to a version that was published before the upgrade of the package dependency.

With version 2.X, results are appended to the result table, and changing filters in the ML Sensor does not reset the pipeline. For more information, see Components and data flow. This ensures that already identified groups do not get overwritten. To remove entries from the “DUPLICATE_INVOICES” table, it is necessary to manually delete the rows in Data Integration via a “DELETE FROM” SQL statement. The “RUN_ID” is a helpful selection criterion.
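
As a minimal sketch, assuming the result table is stored under the name shown above and you have already identified which run to remove (the RUN_ID value below is a placeholder, not a real ID), such a statement could look like this:

  DELETE FROM DUPLICATE_INVOICES WHERE RUN_ID = '<run-id-to-remove>';

Running a SELECT with the same WHERE clause first is a safe way to confirm which rows will be deleted.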

With version 1.X, a document can only be part of one group at a time. Through the upsert mechanism, it can happen that documents form a new group, overwriting the initial group ID. Please upgrade to version 2.X, where the group information is persisted by appending to the table instead of upserting. For more information, see Components and data flow.

With version 1.X, the “_CELONIS_CHANGE_DATE” is not a reliable field to determine the detection date of a group as it can change over time. Please upgrade to version 2.X and use the newly added “GROUP_CREATION_DATE” field as the detection date.
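
As an illustrative example (the table name is assumed from the sections above), a detection-date attribute in the knowledge model would then reference “DUPLICATE_INVOICES”.“GROUP_CREATION_DATE” instead of “DUPLICATE_INVOICES”.“_CELONIS_CHANGE_DATE”.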