Skip to main content

Celonis Product Documentation

TIMELINE_COLUMN - TIMELINE_TABLE
Description

The TIMELINE_COLUMN and TIMELINE_TABLE functions create a continuous timeline for a given time unit (e.g., days). They can be used to aggregate values from one or multiple tables onto one common time axis. This makes it easy to show the trend of a KPI over a period of time.

The TIMELINE functions create a temporary table in the background (the timeline table) with all timestamps of the specified unit in a defined range. The difference is that TIMELINE_COLUMN returns the timestamp column of the temporary timeline table, and TIMELINE_TABLE references the full timeline table. Therefore, the use of the functions depends on whether a column or a table is required. As an example, inside PU-functions or for COUNT_TABLE, the TIMELINE_TABLE function must be used.

It is possible to specify a column based partitioning, in which case the time range is added for each partition.

The timeline table is always sorted by the timestamp column. If a partition is specified, the timestamps within each partition are sorted.

Syntax

TIMELINE_COLUMN and TIMELINE_TABLE have the same arguments:

 TIMELINE_COLUMN ( timeunit, time_columns [, partition ] [, time_range] )
 TIMELINE_TABLE ( timeunit, time_columns [, partition ] [, time_range] )
  • timeunit: YEARS, QUARTERS, MONTHS, WEEKS, DAYS, HOURS, MINUTES

  • time_columns: Zero or more timestamp columns, which should participate in the timeline table. They are propagated to all other occurrences of TIMELINE_COLUMN and TIMELINE_TABLE in the same query.

  • partition: PARTITION [ partition_configuration ] BY ( partition_column AS alias, ... )

    • partition_column AS alias: One or more optional columns to specify the partitioning of the timeline. The alias is needed to be able to access the respective partition column in the timeline table.

    • partition_configuration: One of the following options:

      • FIRST_OCCURRENCE TO LAST_OCCURRENCE. For every partition, adds missing timestamps between the first and last value of the partition (default).

      • FIRST_OCCURRENCE TO TIMELINE_END. For every partition, adds missing timestamps between the first value of the partition and the end of the timeline.

      • TIMELINE_START TO LAST_OCCURRENCE. For every partition, adds missing timestamps between the start of the timeline and the last value of the partition.

      • TIMELINE_START TO TIMELINE_END. For every partition, adds missing timestamps in the full time range of the timeline.

  • time_range: Manually sets the time range for the timeline using DATE constants. One of:

    • FROM ( date_constant ): Specifies the start of the timeline.

    • TO ( date_constant ): Specifies the end of the timeline.

    • FROM ( date_constant ), TO ( date_constant ): Specifies start and end of the timeline.

TIMELINE result

TIMELINE_COLUMN and TIMELINE_TABLE both create a temporary timeline table, which contains one timeline column. This timeline column contains all distinct rounded timestamps inside the given range. Inside the range, no timestamp based on the time unit is missing. Timestamps that are not part of the input column(s) are filled in automatically. All rows are distinct.

Partitioning

The partition columns specify for which groups the timeline is created. The timeline for each group is added to the timeline table. If no time range is provided, the start and end timestamp of the timeline are determined by the minimum and maximum timestamp among all groups (i.e. the smallest and largest timestamp among all contributing timestamp columns).

The timeline table stays sorted by the timestamps in the partitions. Because of this, window functions like LEAD and LAG can be used without having to specify an ORDER BY. The partition configuration specifies which values are filled in for each partition.

Row limit

The overall maximum number of output rows is limited to 100 million rows. When using PARTITION BY with a lot of groups, this limit can be reached easily since the timeline is added for each group. If the generated table exceeds this limit, the TIMELINE query needs to be adapted by using the FROM and TO options to further restrict the timeline, or by choosing a different time unit.

Configuration Propagation

The TIMELINE configuration is propagated to all other occurrences of TIMELINE_COLUMN and TIMELINE_TABLE in the same query. Some configurations can be conflicting, which would result in different timeline tables. Other configurations are not conflicting and will be merged.

Conflicting configurations

Conflicting configurations lead to different timeline tables if different conflicting configurations occur inside the same query. Since different timeline tables cannot be joined, this usually leads to "No common table" errors. Therefore, inside one query, it is best practice to specify the configuration only once inside one TIMELINE_COLUMN or TIMELINE_TABLE occurrence. The configuration propagation will propagate this configuration to all other occurrences of TIMELINE_COLUMN and TIMELINE_TABLE.

The following configurations are conflicting:

  • timeunit: It is not possible to have different time units in the same timeline table.

  • partition: The PARTITION BY statement needs to be written completely once; different PARTITION BYs from different TIMELINE_COLUMN or TIMELINE_TABLE occurrences will not be merged.

  • time_range: It is not possible to have different time ranges in the same timeline table.

Merge-able configurations

This configuration is merge-able:

  • time_columns: All occurring timestamp columns from different TIMELINE_COLUMN and TIMELINE_TABLE occurrences in the same query will be merged. This means that all of those time columns contribute to the same timeline table.

NULL handling

NULL values in the input timestamp columns behave as if they were not there. That means that they do not influence the range of the timeline, and there will be no NULL value inside the timeline column. When all input columns are empty or all values of all input columns are NULL, then the resulting timeline has zero rows, except when a range is given by the FROM and TO parameters.

This also holds when the timeline is partitioned: Partitions where the range for that partition cannot be calculated because there are no (non-null) timestamp values in the input timestamp column, and there is no time range given by FROM and TO, the partition will not be part of the resulting timeline table.

Nesting

It is not allowed to nest multiple TIMELINE_COLUMN or TIMELINE_TABLE occurrences, i.e. it is not allowed to apply TIMELINE_COLUMN or TIMELINE_TABLE on another timeline table.

Joins

The timeline table cannot be joined to the rest of the Data Model. But the TIMELINE_TABLE function can be used as a target inside a PU-function to aggregate values from the data model onto the timeline.

Usage of TIMELINE_TABLE inside PU-functions

The timeline table is a table with distinct rows. Therefore, it is possible to use TIMELINE_TABLE inside PU-functions (similar to DOMAIN_TABLE) to aggregate values of the Data Model onto the timeline. This enables the calculation of different KPIs on the same timeline.

When using TIMELINE_TABLE as a target inside a PU-function, a timestamp column must be set inside the TIMELINE_TABLE call. The source values will be aggregated based on the timestamps of this timestamp column. If multiple timestamp columns are specified, the first one is taken as the target reference.

Filter propagation

Filters on tables from the Data Model are not propagated to the timeline table to ensure a continuous time axis even if filters are set. It is also not possible to use filter-aware functions (like FILTER_TO_NULL) inside any input argument of TIMELINE_COLUMN and TIMELINE_TABLE.

However, FILTER_TO_NULL can be used inside the PU-functions to only aggregate values that fulfill the filter condition to the timeline. By default, PU-functions ignore filters.

Filter propagation from the timeline table to the other tables from the Data Model is currently not supported. Filters which are applied directly to the timeline table only affect the timeline table itself, as well as potentially other temporary tables that are derived from the timeline table, e.g. DOMAIN_TABLEs.

[1]

Missing timestamps are filled in and the timestamps are rounded to the given time unit, i.e. days in this case. Since no time range is given, the minimum and maximum timestamp are the start and end date:

Query

Column1

         TIMELINE_COLUMN ( DAYS , "Table1"."Timestamp" )
        

Input

Output

Table1

Timestamp : date

Sat Jan 01 2022 00:00:00.000

Mon Jan 03 2022 00:00:00.000

Mon Jan 03 2022 00:00:00.000

Wed Jan 05 2022 00:00:00.000

Result

Column1 : date

Sat Jan 01 2022 00:00:00.000

Sun Jan 02 2022 00:00:00.000

Mon Jan 03 2022 00:00:00.000

Tue Jan 04 2022 00:00:00.000

Wed Jan 05 2022 00:00:00.000

[2]

Using the time range one can crop and extend the timeline between two given dates:

Query

Column1

         TIMELINE_COLUMN ( DAYS , "Table1"."Timestamp" , FROM ( {d '2022-01-03' } ) , TO ( {d '2022-01-06' } ) )
        

Input

Output

Table1

Timestamp : date

Sat Jan 01 2022 00:00:00.000

Mon Jan 03 2022 00:00:00.000

Mon Jan 03 2022 00:00:00.000

Wed Jan 05 2022 00:00:00.000

Result

Column1 : date

Mon Jan 03 2022 00:00:00.000

Tue Jan 04 2022 00:00:00.000

Wed Jan 05 2022 00:00:00.000

Thu Jan 06 2022 00:00:00.000

[3]

This example shows how to calculate a KPI on a timeline using a PU-function. The query calculates the number of activities per day. Days without any activities are also contained in the result.

The example also shows how the configuration propagation works: In the first dimension, which returns the timeline column, only the time unit DAYS is specified. In the second dimension, the origin timestamp column is specified inside TIMELINE_TABLE. In the background, one single timeline table is created from both specifications:

Query

Column1

         TIMELINE_COLUMN ( DAYS )
        

Column2

         PU_COUNT ( TIMELINE_TABLE ( "Activities"."Timestamp" ) , "Activities"."Activity" )
        

Input

Output

Activities

CaseId : int

Activity : string

Timestamp : date

1

'A'

Sat Jan 01 2022 01:00:00.000

1

'B'

Sun Jan 02 2022 10:00:00.000

1

'C'

Tue Jan 04 2022 04:00:00.000

2

'A'

Sat Jan 01 2022 01:00:00.000

2

'B'

Tue Jan 04 2022 08:00:00.000

2

'C'

Tue Jan 04 2022 11:00:00.000

2

'D'

Thu Jan 06 2022 07:00:00.000

Result

Column1 : date

Column2 : int

Sat Jan 01 2022 00:00:00.000

2

Sun Jan 02 2022 00:00:00.000

1

Mon Jan 03 2022 00:00:00.000

0

Tue Jan 04 2022 00:00:00.000

3

Wed Jan 05 2022 00:00:00.000

0

Thu Jan 06 2022 00:00:00.000

1

[4]

This example shows how KPIs from different tables can be brought to the same timeline. The Data Model contains two Activity tables, which are not connected. For each Activity table, the number of activities per day is calculated.

The query is similar to the one from the previous example, the only difference is that now, there is one additional dimension which calculates the KPI for the second Activity table. Since the timestamp columns specified in both TIMELINE_TABLE occurrences are merged into one configuration, both timestamp columns contribute to the same timeline table.

Query

Column1

         TIMELINE_COLUMN ( DAYS )
        

Column2

         PU_COUNT ( TIMELINE_TABLE ( "Activities1"."Timestamp" ) , "Activities1"."Activity" )
        

Column3

         PU_COUNT ( TIMELINE_TABLE ( "Activities2"."Timestamp" ) , "Activities2"."Activity" )
        

Input

Output

Activities1

CaseId : int

Activity : string

Timestamp : date

1

'A'

Sat Jan 01 2022 01:00:00.000

1

'B'

Sun Jan 02 2022 10:00:00.000

1

'C'

Tue Jan 04 2022 04:00:00.000

2

'A'

Sat Jan 01 2022 01:00:00.000

2

'B'

Tue Jan 04 2022 08:00:00.000

2

'C'

Tue Jan 04 2022 11:00:00.000

2

'D'

Thu Jan 06 2022 07:00:00.000

Activities2

CaseId : int

Activity : string

Timestamp : date

100

'X'

Mon Jan 03 2022 01:00:00.000

100

'Y'

Mon Jan 03 2022 10:00:00.000

100

'Z'

Tue Jan 04 2022 04:00:00.000

200

'X'

Thu Jan 06 2022 01:00:00.000

200

'Y'

Fri Jan 07 2022 08:00:00.000

Result

Column1 : date

Column2 : int

Column3 : int

Sat Jan 01 2022 00:00:00.000

2

0

Sun Jan 02 2022 00:00:00.000

1

0

Mon Jan 03 2022 00:00:00.000

0

2

Tue Jan 04 2022 00:00:00.000

3

1

Wed Jan 05 2022 00:00:00.000

0

0

Thu Jan 06 2022 00:00:00.000

1

1

Fri Jan 07 2022 00:00:00.000

0

1

[5]

In this simple example, the timeline table is partitioned by the Country. For each partition, the timeline column contains all timestamps between the smallest and largest timestamp from that partition.

Since the TIMELINE_TABLE occurrence is not used as a target table inside a PU-function, but used to access the partition column of the TIMELINE_TABLE, it is not necessary to specify a timestamp column inside the TIMELINE_TABLE function. All configurations are placed inside the TIMELINE_COLUMN function, and are propagated to the TIMELINE_TABLE function:

Query

Column1

         TIMELINE_COLUMN ( DAYS , "Table1"."Timestamp" , PARTITION BY ( "Table1"."Country" AS "Country" ) )
        

Column2

         TIMELINE_TABLE()."Country"
        

Input

Output

Table1

Timestamp : date

Country : string

Fri Jan 01 2021 00:00:00.000

'DE'

Sun Jan 03 2021 00:00:00.000

'DE'

Sun Jan 03 2021 00:00:00.000

'US'

Mon Jan 04 2021 00:00:00.000

'DE'

Thu Jan 07 2021 00:00:00.000

'US'

Result

Column1 : date

Column2 : string

Fri Jan 01 2021 00:00:00.000

'DE'

Sat Jan 02 2021 00:00:00.000

'DE'

Sun Jan 03 2021 00:00:00.000

'DE'

Mon Jan 04 2021 00:00:00.000

'DE'

Sun Jan 03 2021 00:00:00.000

'US'

Mon Jan 04 2021 00:00:00.000

'US'

Tue Jan 05 2021 00:00:00.000

'US'

Wed Jan 06 2021 00:00:00.000

'US'

Thu Jan 07 2021 00:00:00.000

'US'

[6]

Same scenario as above, but with the FIRST_OCCURRENCE TO TIMELINE_END partition configuration. With this configuration, the timeline for each partition ranges from the smallest timestamp of the partition to the end of the timeline.

Since the end of the time range is not specified using the TO keyword, the timeline end is defined by the largest timestamp from the input column (independent of the partition).

Query

Column1

         TIMELINE_COLUMN ( DAYS , "Table1"."Timestamp" , PARTITION FIRST_OCCURRENCE TO TIMELINE_END BY ( "Table1"."Country" AS "Country" ) )
        

Column2

         TIMELINE_TABLE()."Country"
        

Input

Output

Table1

Timestamp : date

Country : string

Fri Jan 01 2021 00:00:00.000

'DE'

Sun Jan 03 2021 00:00:00.000

'DE'

Sun Jan 03 2021 00:00:00.000

'US'

Mon Jan 04 2021 00:00:00.000

'DE'

Thu Jan 07 2021 00:00:00.000

'US'

Result

Column1 : date

Column2 : string

Fri Jan 01 2021 00:00:00.000

'DE'

Sat Jan 02 2021 00:00:00.000

'DE'

Sun Jan 03 2021 00:00:00.000

'DE'

Mon Jan 04 2021 00:00:00.000

'DE'

Tue Jan 05 2021 00:00:00.000

'DE'

Wed Jan 06 2021 00:00:00.000

'DE'

Thu Jan 07 2021 00:00:00.000

'DE'

Sun Jan 03 2021 00:00:00.000

'US'

Mon Jan 04 2021 00:00:00.000

'US'

Tue Jan 05 2021 00:00:00.000

'US'

Wed Jan 06 2021 00:00:00.000

'US'

Thu Jan 07 2021 00:00:00.000

'US'

[7]

This example shows that a filter on a table from the Data Model is not propagated to the timeline table. Even though the FILTER statement filters on the source timestamp column of the timeline, the timeline column still shows the full time range.

By using FILTER_TO_NULL inside the PU_COUNT, the filter is still taken into account in the actual PU_COUNT calculation, which returns 0 for values that do not pass the filter condition:

Query

Filter

         FILTER "Table1"."Timestamp" < {d '2022-01-04' };
        

Column1

         TIMELINE_COLUMN ( DAYS )
        

Column2

         PU_COUNT ( TIMELINE_TABLE ( "Table1"."Timestamp" ) , FILTER_TO_NULL ( "Table1"."Timestamp" ) )
        

Input

Output

Table1

Timestamp : date

Sat Jan 01 2022 00:00:00.000

Mon Jan 03 2022 00:00:00.000

Mon Jan 03 2022 00:00:00.000

Wed Jan 05 2022 00:00:00.000

Result

Column1 : date

Column2 : int

Sat Jan 01 2022 00:00:00.000

1

Sun Jan 02 2022 00:00:00.000

0

Mon Jan 03 2022 00:00:00.000

2

Tue Jan 04 2022 00:00:00.000

0

Wed Jan 05 2022 00:00:00.000

0

[8]

This example illustrates the behavior of PU-functions when aggregating on a TIMELINE that is created from two timestamp columns of two connected tables. In the example query, the first column returns the timeline column of the timeline table. Inside the TIMELINE_COLUMN statement, the time unit to be used is specified (DAYS). The participating timestamp columns are specified in the TIMELINE_TABLE calls from the following four dimensions. These four dimensions, all created by PU_SUM statements, behave as follows:

  1. The first PU_SUM aggregates all values from Table1 to the timeline based on the according timestamp from Table1.

  2. The second PU_SUM aggregates all values from TableN to the timeline based on the according timestamp from TableN.

  3. The third PU_SUM aggregates all values from TableN to the timeline based on the according timestamp from Table1. Although the values of TableN are aggregated, and the timestamp column of TableN participates in the timeline, the result of this PU_SUM is based on the timestamp column of Table1 since this is specified inside the TIMELINE_TABLE call.

  4. The fourth PU_SUM returns the same result as the third one. In this TIMELINE_TABLE occurrence, both timestamp columns are specified. Since all timestamp column configurations of all TIMELINE_TABLE or TIMELINE_COLUMN occurrences of a query are merged, this is in the end the same configuration as in all other TIMELINE_TABLE occurrences. However, the order of the timestamp columns is important: The value will be aggregated according to the first timestamp column that is specified. In this case, the value of TableN is aggregated based on the according timestamp from Table1.

Query

Column1

         TIMELINE_COLUMN ( DAYS )
        

Column2

         PU_SUM ( TIMELINE_TABLE ( "Table1"."Timestamp" ) , "Table1"."Value" )
        

Column3

         PU_SUM ( TIMELINE_TABLE ( "TableN"."Timestamp" ) , "TableN"."Value" )
        

Column4

         PU_SUM ( TIMELINE_TABLE ( "Table1"."Timestamp" ) , "TableN"."Value" )
        

Column5

         PU_SUM ( TIMELINE_TABLE ( "Table1"."Timestamp" , "TableN"."Timestamp" ) , "TableN"."Value" )
        

Input

Output

Table1

ID : int

Value : float

Timestamp : date

1

1.1

Fri Jan 01 2021 00:00:00.000

2

2.2

Mon Jan 04 2021 00:00:00.000

3

3.3

Sun Jan 03 2021 00:00:00.000

4

4.4

Thu Jan 07 2021 00:00:00.000

TableN

Table1ID : int

Value : int

Timestamp : date

1

1000

Sun Jan 03 2021 00:00:00.000

1

500

Sun Jan 03 2021 01:00:00.000

1

40

Tue Jan 05 2021 02:00:00.000

4

111

Tue Jan 05 2021 00:00:00.000

4

222

Tue Jan 05 2021 01:00:00.000

Foreign Keys

Table1.ID

TableN.Table1ID

Result

Column1 : date

Column2 : float

Column3 : int

Column4 : int

Column5 : int

Fri Jan 01 2021 00:00:00.000

1.1

null

1540

1540

Sat Jan 02 2021 00:00:00.000

null

null

null

null

Sun Jan 03 2021 00:00:00.000

3.3

1500

null

null

Mon Jan 04 2021 00:00:00.000

2.2

null

null

null

Tue Jan 05 2021 00:00:00.000

null

373

null

null

Wed Jan 06 2021 00:00:00.000

null

null

null

null

Thu Jan 07 2021 00:00:00.000

4.4

null

333

333