Skip to main content

Celonis Product Documentation

Machine Utilization in Production

Description

This example shows how to calculate the throughput time of a machine with PQL.

In production use cases machine throughput time is often a number of interest. Calculating this number is not straight forward as the event log is usually built around the products which go through the assembly line and not the machine. Functions which are usually used for calculating the throughput time like SOURCE and TARGET or CALC_THROUGHPUT operate on case level. But a machine is normally involved in many cases. Therefore, LEAD and LAG can be used to calculate the duration of activities through a variety of cases. These functions can also be used to trace errors.

Duration of Stations

The duration of a station can be calculated by comparing the timestamp of an Activity to the timestamp of the next Activity, that can be identified using LEAD. The duration can also be calculated using LAG, but this would return the duration of the previous activity.

[1]

With LEAD, we can compare the timestamps of Activities to the timestamps of their following Activities, that belong to the same Activity type. A function that calculates Time Differences like HOURS_BETWEEN can be used to then return the duration for each Activity.

Query

Filter

         FILTER LEAD ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) ) IS NOT NULL;
        

Column1

         "Activities"."Activity"
        

Column2

         HOURS_BETWEEN ( "Activities"."Timestamp" , LEAD ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) ) )
        

Input

Output

Activities

Case : int

Activity : string

Timestamp : date

1

'A'

Mon Feb 01 2016 01:00:00.000

1

'B'

Mon Feb 01 2016 02:00:00.000

2

'A'

Mon Feb 01 2016 02:00:00.000

2

'B'

Mon Feb 01 2016 03:00:00.000

3

'A'

Mon Feb 01 2016 04:00:00.000

3

'B'

Mon Feb 01 2016 06:00:00.000

Result

Column1 : string

Column2 : float

'A'

1.0

'B'

1.0

'A'

2.0

'B'

3.0

[2]

In this example we can see, how the average duration of each activity type can be calculated.

Query

Filter

         FILTER LEAD ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) ) IS NOT NULL;
        

Column1

         "Activities"."Activity"
        

Column2

         AVG ( HOURS_BETWEEN ( "Activities"."Timestamp" , LEAD ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) ) ) )
        

Input

Output

Activities

Case : int

Activity : string

Timestamp : date

1

'A'

Mon Feb 01 2016 01:00:00.000

1

'B'

Mon Feb 01 2016 02:00:00.000

2

'A'

Mon Feb 01 2016 02:00:00.000

2

'B'

Mon Feb 01 2016 03:00:00.000

3

'A'

Mon Feb 01 2016 04:00:00.000

3

'B'

Mon Feb 01 2016 06:00:00.000

Result

Column1 : string

Column2 : float

'A'

1.5

'B'

2.0

[3]

This is an example how to calculate the duration of one activity at its station with the usage of LAG. As B could identify a certain machine, this would show the duration of every previously performed work at this station.

Query

Filter

         FILTER LAG ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) ) IS NOT NULL;
        

Filter

         FILTER "Activities"."Activity" = 'B';
        

Column1

         "Activities"."Case"
        

Column2

         HOURS_BETWEEN ( LAG ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Activity" ) ) , "Activities"."Timestamp" )
        

Input

Output

Activities

Case : int

Activity : string

Timestamp : date

1

'A'

Mon Feb 01 2016 01:00:00.000

1

'B'

Mon Feb 01 2016 02:00:00.000

2

'A'

Mon Feb 01 2016 02:00:00.000

2

'B'

Mon Feb 01 2016 03:00:00.000

3

'A'

Mon Feb 01 2016 04:00:00.000

3

'B'

Mon Feb 01 2016 06:00:00.000

Result

Column1 : int

Column2 : float

2

1.0

3

3.0

Error Cause

In case of errors occurring in processes, it is always interesting to identify the reason behind this error. With the usage of LEAD and LAG, it is easy to find the activities that happened before or after an error.

[4]

This example shows how the cause for an Error could be identified using LAG.

Query

Filter

         FILTER ( "Activities"."Activity" ) LIKE 'Error%';
        

Column1

         "Activities"."Activity"
        

Column2

         LAG ( "Activities"."Activity" )
        

Input

Output

Activities

Case : int

Activity : string

Timestamp : date

1

'A'

Mon Feb 01 2016 01:00:00.000

1

'B'

Mon Feb 01 2016 02:00:00.000

1

'Error X'

Mon Feb 01 2016 04:00:00.000

2

'A'

Mon Feb 01 2016 02:00:00.000

2

'Error Y'

Mon Feb 01 2016 03:00:00.000

2

'B'

Mon Feb 01 2016 04:00:00.000

3

'A'

Mon Feb 01 2016 04:00:00.000

3

'Error Z'

Mon Feb 01 2016 05:00:00.000

3

'B'

Mon Feb 01 2016 07:00:00.000

Result

Column1 : string

Column2 : string

'Error X'

'B'

'Error Y'

'A'

'Error Z'

'A'

Error Duration

Furthermore the duration of errors is also interesting with regard to delays in production or the duration of production in general. LEAD and LAG can be used to calculate the duration of errors.

[5]

This example calculates the duration of each station after an error occured with the usage of LAG.

Query

Column1

         CASE WHEN LAG ( "Activities"."Activity" , PARTITION BY ( "Activities"."Case" ) ) LIKE 'Error%' THEN HOURS_BETWEEN ( LAG ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Case" ) ) , "Activities"."Timestamp" ) ELSE null END
        

Input

Output

Activities

Case : int

Activity : string

Timestamp : date

1

'A'

Mon Feb 01 2016 01:00:00.000

1

'B'

Mon Feb 01 2016 02:00:00.000

1

'Error X'

Mon Feb 01 2016 04:00:00.000

2

'A'

Mon Feb 01 2016 02:00:00.000

2

'Error Y'

Mon Feb 01 2016 03:00:00.000

2

'B'

Mon Feb 01 2016 04:00:00.000

3

'A'

Mon Feb 01 2016 04:00:00.000

3

'Error Z'

Mon Feb 01 2016 05:00:00.000

3

'B'

Mon Feb 01 2016 07:00:00.000

Result

Column1 : float

null

null

null

null

null

1.0

null

null

2.0

[6]

In this example, LAG is used to calculate the average duration of an Error.

Query

Column1

         AVG ( CASE WHEN LAG ( "Activities"."Activity" , PARTITION BY ( "Activities"."Case" ) ) LIKE 'Error%' THEN HOURS_BETWEEN ( LAG ( "Activities"."Timestamp" , PARTITION BY ( "Activities"."Case" ) ) , "Activities"."Timestamp" ) ELSE null END )
        

Input

Output

Activities

Case : int

Activity : string

Timestamp : date

1

'A'

Mon Feb 01 2016 01:00:00.000

1

'B'

Mon Feb 01 2016 02:00:00.000

1

'Error X'

Mon Feb 01 2016 04:00:00.000

2

'A'

Mon Feb 01 2016 02:00:00.000

2

'Error Y'

Mon Feb 01 2016 03:00:00.000

2

'B'

Mon Feb 01 2016 04:00:00.000

3

'A'

Mon Feb 01 2016 04:00:00.000

3

'Error Z'

Mon Feb 01 2016 05:00:00.000

3

'B'

Mon Feb 01 2016 07:00:00.000

Result

Column1 : float

1.5