Skip to main content

Celonis Product Documentation

Activity table sorting

Description

This article gives an overview of how the sorting of the activity table is defined in Celonis.

Sorting Column

An activity table consists of at least a case id, activity, and timestamp column. A sorting column is optional. The sorting column is used if two activities inside a case have the same timestamp. Then the order of the activities is determined by the sorting column. The activity with the lower sorting value is regarded as the activity which happens first.

Example

[1]

In this example, the timestamps of activities B and C are equal. Those activities are therefore ordered by their sorting value, which is why C is regarded to happen before B:

Query

Column1

         "ActivityTable"."ACTIVITY"
        

Column2

         "ActivityTable"."TIMESTAMP"
        

Column3

         "ActivityTable"."SORTING"
        

Input

Output

ActivityTable

CASE : int

ACTIVITY : string

TIMESTAMP : date

SORTING : int

1

'A'

Sun Jan 01 2017 15:31:00.000

10

1

'B'

Sun Jan 01 2017 16:00:00.000

20

1

'C'

Sun Jan 01 2017 16:00:00.000

15

1

'D'

Sun Jan 01 2017 16:45:00.000

10

Result

Column1 : string

Column2 : date

Column3 : int

'A'

Sun Jan 01 2017 15:31:00.000

10

'C'

Sun Jan 01 2017 16:00:00.000

15

'B'

Sun Jan 01 2017 16:00:00.000

20

'D'

Sun Jan 01 2017 16:45:00.000

10

Sorting is important! This might seem like a negligible corner case, but is actually very important. If two activities belong to the same case and have the same timestamp without any configured sorting, the resulting process graph is undefined.

Without setting the sorting column in the example above, this can result in the following process variants:

  • A → B → C → D

  • A → C → B → D

Which process variant is used can vary for each reload. That affects all process-based operators, including Variant and Process Explorer. By using filters based on process edges, even all KPIs can be affected and become unstable.

To not run into unexpected results, you should first ensure that, if available, the sorting column is configured in the data model editor. Second check if each activity type has a unique sorting id. You may get a well-defined process model even though activity types share the same sorting ids. However, this is much harder to verify.

In Celonis, a warning will be shown in the data model load status if the sorting id is not unique per activity type.

Day-based activities

Certain activity types don't have an exact timestamp. The information at what time of the day the activity is taking place is not available. There can be several reasons for this; for example, the information is not stored in the source system. We call such activities day-based activities. All other activities are named timestamp-based activities. We expect the timestamp-based activities to be the default and the day-based activities rather an exception.

In order to identify day-based activity types, Celonis checks for all activities of an activity type whether the time information is missing across the complete eventlog. So a day-based activity is an activity that that always happens precisely at midnight. We are aware that this could, in theory, lead to a false positive. However, in reality, the chances that an activity type always really takes place precisely at midnight, without a second of deviation, is negligible.

Example

[2]

In this example, activity type A is a day-based activity. In the whole eventlog, A always takes place at midnight. Activity type B is a timestamp-based activity. Activity B does indeed take place at midnight in case 1, however, it has a different timestamp in case 2.

As there is no sorting column defined, there is no special treatment for day-based activities.

Query

Column1

         "ActivityTable"."CASE"
        

Column2

         "ActivityTable"."ACTIVITY"
        

Column3

         "ActivityTable"."TIMESTAMP"
        

Input

Output

ActivityTable

CASE : int

ACTIVITY : string

TIMESTAMP : date

1

'A'

Sun Jan 01 2017 00:00:00.000

1

'B'

Mon Jan 02 2017 00:00:00.000

2

'A'

Mon Jan 02 2017 00:00:00.000

2

'B'

Tue Jan 03 2017 10:45:00.000

Result

Column1 : int

Column2 : string

Column3 : date

1

'A'

Sun Jan 01 2017 00:00:00.000

1

'B'

Mon Jan 02 2017 00:00:00.000

2

'A'

Mon Jan 02 2017 00:00:00.000

2

'B'

Tue Jan 03 2017 10:45:00.000

Ordering of day-based activities

Since we are lacking the information at what exact time of the day a day-based activity takes place, we are relying on the sorting column (if available) to order day-based activities and timestamp-based activities which happen on the same day. As a result, the timestamps of day-based activities need to be changed. The following example explains this behavior in detail:

Example

[3]

Here activity A is a day-based activity. B, C and D also happened on the same day, but they have exact timestamps (timestamp-based activities). Now, Celonis relies on the sorting column to place activity A in the right position.

It tries to move activity A further down the process variant, as long as its sorting value is larger than the sorting value of the following activity.

In the given example, Celonis would perform the following steps:

  1. Compare sorting of A with B → Sorting of A is larger → A and B switch places

  2. Compare sorting of A with C → Sorting of A is smaller → abort

Now the day-based activity is ordered according to its sorting value. However, now a calculation of a throughput time between B and A would result in a negative throughput time. To fix this, we assign the timestamp of B to A and obtain the result activity table below:

Query

Column1

         "ActivityTable"."CASE"
        

Column2

         "ActivityTable"."ACTIVITY"
        

Column3

         "ActivityTable"."TIMESTAMP"
        

Column4

         "ActivityTable"."SORTING"
        

Input

Output

ActivityTable

CASE : int

ACTIVITY : string

TIMESTAMP : date

SORTING : int

1

'A'

Sun Jan 01 2017 00:00:00.000

20

1

'B'

Sun Jan 01 2017 03:00:00.000

15

1

'C'

Sun Jan 01 2017 07:00:00.000

25

1

'D'

Sun Jan 01 2017 10:45:00.000

10

Result

Column1 : int

Column2 : string

Column3 : date

Column4 : int

1

'B'

Sun Jan 01 2017 03:00:00.000

15

1

'A'

Sun Jan 01 2017 03:00:00.000

20

1

'C'

Sun Jan 01 2017 07:00:00.000

25

1

'D'

Sun Jan 01 2017 10:45:00.000

10

Special Case - Day-based activity stays first activity of the day

If the position of a day-based activity doesn't change because it isn't followed by an activity with a lower sorting value, the day-based activity gets the timestamp of the following activity on the same day, if available.

Example

[4]

Here, activity A is a day-based activity. It cannot be moved down as its sorting value is already smaller than the sorting value of the following activity B. Therefore, we assign the timestamp of the following activity B to A:

Query

Column1

         "ActivityTable"."CASE"
        

Column2

         "ActivityTable"."ACTIVITY"
        

Column3

         "ActivityTable"."TIMESTAMP"
        

Column4

         "ActivityTable"."SORTING"
        

Input

Output

ActivityTable

CASE : int

ACTIVITY : string

TIMESTAMP : date

SORTING : int

1

'A'

Sun Jan 01 2017 00:00:00.000

15

1

'B'

Sun Jan 01 2017 03:00:00.000

20

1

'C'

Sun Jan 01 2017 07:00:00.000

25

1

'D'

Sun Jan 01 2017 10:45:00.000

10

Result

Column1 : int

Column2 : string

Column3 : date

Column4 : int

1

'A'

Sun Jan 01 2017 03:00:00.000

15

1

'B'

Sun Jan 01 2017 03:00:00.000

20

1

'C'

Sun Jan 01 2017 07:00:00.000

25

1

'D'

Sun Jan 01 2017 10:45:00.000

10