Skip to main content

Celonis Product Documentation

Multiple Invoices per Case

Description

This example shows how to calculate the total number of days spent between creating and clearing multiple invoices of one purchase order.

In the Purchase-to-Pay process (P2P), there can be multiple invoices per case (that is, per Purchase Order Item). One example would be the payment by installments of one item. The 'Record Invoice Receipt' activity marks the receipt of an invoice, and 'Clear Invoice' marks the payment of the invoice.

Challenge

For each case, how do you calculate the total number of days spent between matching pairs of 'Record Invoice Receipt' and 'Clear Invoice' activities? The invoice number to which those activities belong is given as a separate column in the activity table.

Solution

In PQL, this question can be answered as follows:

[1]

The total number of days spent between 'Record Invoice Receipt' and 'Clear Invoice' activities per case can be solved using the following query. In the example, the value 10 for case 1 is calculated from the number of days between those two activities related to the invoice number 100 (which is 8 days) plus the days between those two activities related to the invoice number 300 (which is 2 days). In case 2, there is only one pair of 'Record Invoice Receipt' and 'Clear Invoice' activities belonging together, which have 3 days between their occurrences.

Query

Column1

         PU_FIRST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."CASE" )
        

Column2

         SUM ( DAYS_BETWEEN ( PU_FIRST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Record Invoice Receipt' ) , PU_LAST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Clear Invoice' ) ) )
        

Input

Output

Activities

CASE : int

ACTIVITY : string

INVOICE_NUMBER : int

EVENTTIME : date

1

'X'

null

Tue Jan 01 2019 13:00:00.000

1

'Record Invoice Receipt'

100

Wed Jan 02 2019 13:00:00.000

1

'Record Invoice Receipt'

200

Thu Jan 03 2019 13:00:00.000

1

'Record Invoice Receipt'

300

Fri Jan 04 2019 13:00:00.000

1

'Clear Invoice'

300

Sun Jan 06 2019 13:00:00.000

1

'Clear Invoice'

100

Thu Jan 10 2019 13:00:00.000

2

'Record Invoice Receipt'

100

Mon Jan 21 2019 13:00:00.000

2

'Y'

100

Tue Jan 22 2019 13:00:00.000

2

'Clear Invoice'

100

Thu Jan 24 2019 13:00:00.000

Result

Column1 : int

Column2 : float

1

10.0

2

3.0

Step-by-Step Solution

Let's start developing the solution by solving a more simpler question: How can we calculate the number of days between the first 'Record Invoice Receipt' and the last 'Clear Invoice' activities of each case (not taking the invoice number into account)? The timestamps of those activities can be retrieved by using PU_FIRST and PU_LAST, respectively. The number of days between the resulting timestamps can be calculated using the DAYS_BETWEEN functionality.

[2]

In the example eventlog below, the PU_FIRST expression returns the timestamp of the first 'Record Invoice Receipt' of each case, while the PU_LAST expression returns the timestamp of the last 'Clear Invoice' activity of each case. The number of days between those two dates (the time between the first 'Record Invoice Receipt' and the last 'Clear Invoice' activity) is 8 days for case 1, and 3 days for case 2:

Query

Column1

         "Cases"."CASE"
        

Column2

         DAYS_BETWEEN ( PU_FIRST ( "Cases" , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Record Invoice Receipt' ) , PU_LAST ( "Cases" , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Clear Invoice' ) )
        

Input

Output

Activities

CASE : int

ACTIVITY : string

EVENTTIME : date

1

'X'

Tue Jan 01 2019 13:00:00.000

1

'Record Invoice Receipt'

Wed Jan 02 2019 13:00:00.000

1

'Record Invoice Receipt'

Thu Jan 03 2019 13:00:00.000

1

'Record Invoice Receipt'

Fri Jan 04 2019 13:00:00.000

1

'Clear Invoice'

Sun Jan 06 2019 13:00:00.000

1

'Clear Invoice'

Thu Jan 10 2019 13:00:00.000

2

'Record Invoice Receipt'

Mon Jan 21 2019 13:00:00.000

2

'Y'

Tue Jan 22 2019 13:00:00.000

2

'Clear Invoice'

Thu Jan 24 2019 13:00:00.000

Cases

CASE : int

1

2

Foreign Keys

Activities.CASE

Cases.CASE

Result

Column1 : int

Column2 : float

1

8.0

2

3.0

When adding columns from the activity table to the query and thereby joining our previous result to the activity table, the DAYS_BETWEEN result of a case appears next to every activity of that case.

[3]

In the example, all rows of each case are now annotated with the number of days between the first 'Record Invoice Receipt' and the last 'Clear Invoice' activity of that case (which is 8 days for case 1, and 3 days for case 2).

Query

Column1

         "Activities"."CASE"
        

Column2

         "Activities"."ACTIVITY"
        

Column3

         "Activities"."INVOICE_NUMBER"
        

Column4

         DAYS_BETWEEN ( PU_FIRST ( "Cases" , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Record Invoice Receipt' ) , PU_LAST ( "Cases" , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Clear Invoice' ) )
        

Input

Output

Activities

CASE : int

ACTIVITY : string

INVOICE_NUMBER : int

EVENTTIME : date

1

'X'

null

Tue Jan 01 2019 13:00:00.000

1

'Record Invoice Receipt'

100

Wed Jan 02 2019 13:00:00.000

1

'Record Invoice Receipt'

200

Thu Jan 03 2019 13:00:00.000

1

'Record Invoice Receipt'

300

Fri Jan 04 2019 13:00:00.000

1

'Clear Invoice'

300

Sun Jan 06 2019 13:00:00.000

1

'Clear Invoice'

100

Thu Jan 10 2019 13:00:00.000

2

'Record Invoice Receipt'

100

Mon Jan 21 2019 13:00:00.000

2

'Y'

100

Tue Jan 22 2019 13:00:00.000

2

'Clear Invoice'

100

Thu Jan 24 2019 13:00:00.000

Cases

CASE : int

1

2

Foreign Keys

Activities.CASE

Cases.CASE

Result

Column1 : int

Column2 : string

Column3 : int

Column4 : float

1

'X'

null

8.0

1

'Record Invoice Receipt'

100

8.0

1

'Record Invoice Receipt'

200

8.0

1

'Record Invoice Receipt'

300

8.0

1

'Clear Invoice'

300

8.0

1

'Clear Invoice'

100

8.0

2

'Record Invoice Receipt'

100

3.0

2

'Y'

100

3.0

2

'Clear Invoice'

100

3.0

To get the times between every pair of 'Record Invoice Receipt' and 'Clear Invoice' activities with the same invoice number, it is not enough to calculate the PU aggregations only for each case. Instead, we need to calculate them for each unique pair of CASE and INVOICE_NUMBER values. This can be achieved by creating a temporary table with DOMAIN_TABLE, which contains those unique pairs of CASE and INVOICE_NUMBER values. The PU functions then return the according values for each of those pairs.

[4]

Let's take a look at case 1 in the example data. The number of days between the 'Record Invoice Receipt' of invoice 100 and 'Clear Invoice' of invoice 100 is 8 days, which is why both activities are annotated with a 8 in the result. 'Record Invoice Receipt' of invoice 200 does not have a corresponding 'Clear Invoice', which gives a NULL result (because the PU_LAST expression returns NULL in that case). Between the activity pair of invoice 300, the difference is 2 days. In case 2, only one activity pair is contained (invoice 100). The number of days between them (3) is returned for all rows with the invoice number 100 in case 2.

Query

Column1

         "Activities"."CASE"
        

Column2

         "Activities"."ACTIVITY"
        

Column3

         "Activities"."INVOICE_NUMBER"
        

Column4

         DAYS_BETWEEN ( PU_FIRST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Record Invoice Receipt' ) , PU_LAST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Clear Invoice' ) )
        

Input

Output

Activities

CASE : int

ACTIVITY : string

INVOICE_NUMBER : int

EVENTTIME : date

1

'X'

null

Tue Jan 01 2019 13:00:00.000

1

'Record Invoice Receipt'

100

Wed Jan 02 2019 13:00:00.000

1

'Record Invoice Receipt'

200

Thu Jan 03 2019 13:00:00.000

1

'Record Invoice Receipt'

300

Fri Jan 04 2019 13:00:00.000

1

'Clear Invoice'

300

Sun Jan 06 2019 13:00:00.000

1

'Clear Invoice'

100

Thu Jan 10 2019 13:00:00.000

2

'Record Invoice Receipt'

100

Mon Jan 21 2019 13:00:00.000

2

'Y'

100

Tue Jan 22 2019 13:00:00.000

2

'Clear Invoice'

100

Thu Jan 24 2019 13:00:00.000

Result

Column1 : int

Column2 : string

Column3 : int

Column4 : float

1

'X'

null

null

1

'Record Invoice Receipt'

100

8.0

1

'Record Invoice Receipt'

200

null

1

'Record Invoice Receipt'

300

2.0

1

'Clear Invoice'

300

2.0

1

'Clear Invoice'

100

8.0

2

'Record Invoice Receipt'

100

3.0

2

'Y'

100

3.0

2

'Clear Invoice'

100

3.0

The goal is to eventually sum up all the values calculated by the DAYS_BETWEEN function per case. However, the number of days between each pair of 'Record Invoice Receipt' and 'Clear Invoice' activities always appears multiple times in the column: It appears next to every row in the activity table with the corresponding case and invoice number due to the join with the activity table. In order to only have it once per activity pair, we can set every time difference value except for those of the 'Record Invoice Receipt' activity to 0 (or NULL alternatively) by using a CASE WHEN statement.

[5]

Now, the number of days for each case and invoice only appears one in the column (8 days for invoice 100 in case 1, 2 days for invoice 300 in case 1, and 3 days for invoice 100 in case 2):

Query

Column1

         "Activities"."CASE"
        

Column2

         "Activities"."ACTIVITY"
        

Column3

         "Activities"."INVOICE_NUMBER"
        

Column4

         CASE WHEN "Activities"."ACTIVITY" = 'Record Invoice Receipt' THEN DAYS_BETWEEN ( PU_FIRST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Record Invoice Receipt' ) , PU_LAST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Clear Invoice' ) ) ELSE 0.0 END
        

Input

Output

Activities

CASE : int

ACTIVITY : string

INVOICE_NUMBER : int

EVENTTIME : date

1

'X'

null

Tue Jan 01 2019 13:00:00.000

1

'Record Invoice Receipt'

100

Wed Jan 02 2019 13:00:00.000

1

'Record Invoice Receipt'

200

Thu Jan 03 2019 13:00:00.000

1

'Record Invoice Receipt'

300

Fri Jan 04 2019 13:00:00.000

1

'Clear Invoice'

300

Sun Jan 06 2019 13:00:00.000

1

'Clear Invoice'

100

Thu Jan 10 2019 13:00:00.000

2

'Record Invoice Receipt'

100

Mon Jan 21 2019 13:00:00.000

2

'Y'

100

Tue Jan 22 2019 13:00:00.000

2

'Clear Invoice'

100

Thu Jan 24 2019 13:00:00.000

Result

Column1 : int

Column2 : string

Column3 : int

Column4 : float

1

'X'

null

0.0

1

'Record Invoice Receipt'

100

8.0

1

'Record Invoice Receipt'

200

null

1

'Record Invoice Receipt'

300

2.0

1

'Clear Invoice'

300

0.0

1

'Clear Invoice'

100

0.0

2

'Record Invoice Receipt'

100

3.0

2

'Y'

100

0.0

2

'Clear Invoice'

100

0.0

Now we can calculate the total number of days spent between 'Record Invoice Receipt' and 'Clear Invoice' activities for each case by summing up the values of the column we created per case.

[6]

In case 1, the 8 days of invoice 100 and the 2 days of invoice 300 sum up to 10. In case 2, the 3 days for clearing invoice 100 make up the final solution.

Query

Column1

         "Activities"."CASE"
        

Column2

         SUM ( CASE WHEN "Activities"."ACTIVITY" = 'Record Invoice Receipt' THEN DAYS_BETWEEN ( PU_FIRST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Record Invoice Receipt' ) , PU_LAST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Clear Invoice' ) ) ELSE 0.0 END )
        

Input

Output

Activities

CASE : int

ACTIVITY : string

INVOICE_NUMBER : int

EVENTTIME : date

1

'X'

null

Tue Jan 01 2019 13:00:00.000

1

'Record Invoice Receipt'

100

Wed Jan 02 2019 13:00:00.000

1

'Record Invoice Receipt'

200

Thu Jan 03 2019 13:00:00.000

1

'Record Invoice Receipt'

300

Fri Jan 04 2019 13:00:00.000

1

'Clear Invoice'

300

Sun Jan 06 2019 13:00:00.000

1

'Clear Invoice'

100

Thu Jan 10 2019 13:00:00.000

2

'Record Invoice Receipt'

100

Mon Jan 21 2019 13:00:00.000

2

'Y'

100

Tue Jan 22 2019 13:00:00.000

2

'Clear Invoice'

100

Thu Jan 24 2019 13:00:00.000

Result

Column1 : int

Column2 : float

1

10.0

2

3.0

Since we want to see the Case ID in the result table, we had to join our throughput times to the activity table. That means we had to use a CASE WHEN in order to only sum up each value once. However, we could also avoid the join to the activity table by retrieving the Case ID in a PU function, using the same DOMAIN_TABLE as in the PU functions in the DAYS_BETWEEN statement. In the DOMAIN_TABLE, all distinct groups of case and invoice numbers are contained. Let's look at this without summing up all throughput times per case first. To get context for the numbers, we also include the invoice number using the same technique.

[7]

Each unique pair of case and invoice number leads to one row in the result table. The throughput times for each case and invoice are the same as in the examples above.

Query

Column1

         PU_FIRST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."CASE" )
        

Column2

         PU_FIRST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."INVOICE_NUMBER" )
        

Column3

         DAYS_BETWEEN ( PU_FIRST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Record Invoice Receipt' ) , PU_LAST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Clear Invoice' ) )
        

Input

Output

Activities

CASE : int

ACTIVITY : string

INVOICE_NUMBER : int

EVENTTIME : date

1

'X'

null

Tue Jan 01 2019 13:00:00.000

1

'Record Invoice Receipt'

100

Wed Jan 02 2019 13:00:00.000

1

'Record Invoice Receipt'

200

Thu Jan 03 2019 13:00:00.000

1

'Record Invoice Receipt'

300

Fri Jan 04 2019 13:00:00.000

1

'Clear Invoice'

300

Sun Jan 06 2019 13:00:00.000

1

'Clear Invoice'

100

Thu Jan 10 2019 13:00:00.000

2

'Record Invoice Receipt'

100

Mon Jan 21 2019 13:00:00.000

2

'Y'

100

Tue Jan 22 2019 13:00:00.000

2

'Clear Invoice'

100

Thu Jan 24 2019 13:00:00.000

Result

Column1 : int

Column2 : int

Column3 : float

1

null

null

1

100

8.0

1

200

null

1

300

2.0

2

100

3.0

Now we can sum up the throughput time values to get the final result. As we want to sum them up for each case, we need to remove the invoice number dimension again.

[8]

In case 1, the 8 days of invoice 100 and the 2 days of invoice 300 sum up to 10. In case 2, the 3 days for clearing invoice 100 make up the final solution.

Query

Column1

         PU_FIRST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."CASE" )
        

Column2

         SUM ( DAYS_BETWEEN ( PU_FIRST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Record Invoice Receipt' ) , PU_LAST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Clear Invoice' ) ) )
        

Input

Output

Activities

CASE : int

ACTIVITY : string

INVOICE_NUMBER : int

EVENTTIME : date

1

'X'

null

Tue Jan 01 2019 13:00:00.000

1

'Record Invoice Receipt'

100

Wed Jan 02 2019 13:00:00.000

1

'Record Invoice Receipt'

200

Thu Jan 03 2019 13:00:00.000

1

'Record Invoice Receipt'

300

Fri Jan 04 2019 13:00:00.000

1

'Clear Invoice'

300

Sun Jan 06 2019 13:00:00.000

1

'Clear Invoice'

100

Thu Jan 10 2019 13:00:00.000

2

'Record Invoice Receipt'

100

Mon Jan 21 2019 13:00:00.000

2

'Y'

100

Tue Jan 22 2019 13:00:00.000

2

'Clear Invoice'

100

Thu Jan 24 2019 13:00:00.000

Result

Column1 : int

Column2 : float

1

10.0

2

3.0

As a bonus, we can also calculate the total number of days spent between 'Record Invoice Receipt' and 'Clear Invoice' activities over all cases. The CASE WHEN statement is not required here because the result is not joined to the activity or case table.

[9]

The total number of days is 13 (the 10 days of case 1 plus the 3 days of case 2):

Query

Column1

         SUM ( DAYS_BETWEEN ( PU_FIRST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Record Invoice Receipt' ) , PU_LAST ( DOMAIN_TABLE ( "Activities"."CASE" , "Activities"."INVOICE_NUMBER" ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Clear Invoice' ) ) )
        

Input

Output

Activities

CASE : int

ACTIVITY : string

INVOICE_NUMBER : int

EVENTTIME : date

1

'X'

null

Tue Jan 01 2019 13:00:00.000

1

'Record Invoice Receipt'

100

Wed Jan 02 2019 13:00:00.000

1

'Record Invoice Receipt'

200

Thu Jan 03 2019 13:00:00.000

1

'Record Invoice Receipt'

300

Fri Jan 04 2019 13:00:00.000

1

'Clear Invoice'

300

Sun Jan 06 2019 13:00:00.000

1

'Clear Invoice'

100

Thu Jan 10 2019 13:00:00.000

2

'Record Invoice Receipt'

100

Mon Jan 21 2019 13:00:00.000

2

'Y'

100

Tue Jan 22 2019 13:00:00.000

2

'Clear Invoice'

100

Thu Jan 24 2019 13:00:00.000

Result

Column1 : float

13.0

If no INVOICE_NUMBER column is given in the activity table to group corresponding activities, we could assume that the 'Clear Invoice' activities appear in the same order as the corresponding 'Record Invoice Receipt' activities. So for each case, we could calculate the number of days between the first 'Record Invoice Receipt' and the first 'Clear Invoice', between the second 'Record Invoice Receipt' and the second 'Clear Invoice', and so on using the INDEX_ORDER functionality:

[10]

In case 1, the number of days between the first 'Record Invoice Receipt' and the first 'Clear Invoice' is 4 days, and the number of days between the second 'Record Invoice Receipt' and the second 'Clear Invoice' is 7 days, which sums up to 11 days. There is no corresponding third 'Clear Invoice' for the third 'Record Invoice Receipt'. In case 2, there is again only one pair of 'Record Invoice Receipt' and 'Clear Invoice' activities belonging together, which leads to the same result as before:

Query

Column1

         "Activities"."CASE"
        

Column2

         SUM ( CASE WHEN "Activities"."ACTIVITY" = 'Record Invoice Receipt' THEN DAYS_BETWEEN ( PU_FIRST ( DOMAIN_TABLE ( "Activities"."CASE" , INDEX_ORDER ( "Activities"."ACTIVITY" , GROUP ( "Activities"."CASE" , "Activities"."ACTIVITY" ) ) ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Record Invoice Receipt' ) , PU_LAST ( DOMAIN_TABLE ( "Activities"."CASE" , INDEX_ORDER ( "Activities"."ACTIVITY" , GROUP ( "Activities"."CASE" , "Activities"."ACTIVITY" ) ) ) , "Activities"."EVENTTIME" , "Activities"."ACTIVITY" = 'Clear Invoice' ) ) ELSE 0.0 END )
        

Input

Output

Activities

CASE : int

ACTIVITY : string

EVENTTIME : date

1

'X'

Tue Jan 01 2019 13:00:00.000

1

'Record Invoice Receipt'

Wed Jan 02 2019 13:00:00.000

1

'Record Invoice Receipt'

Thu Jan 03 2019 13:00:00.000

1

'Record Invoice Receipt'

Fri Jan 04 2019 13:00:00.000

1

'Clear Invoice'

Sun Jan 06 2019 13:00:00.000

1

'Clear Invoice'

Thu Jan 10 2019 13:00:00.000

2

'Record Invoice Receipt'

Mon Jan 21 2019 13:00:00.000

2

'Y'

Tue Jan 22 2019 13:00:00.000

2

'Clear Invoice'

Thu Jan 24 2019 13:00:00.000

Cases

CASE : int

1

2

Foreign Keys

Activities.CASE

Cases.CASE

Result

Column1 : int

Column2 : float

1

11.0

2

3.0