Skip to main content

Celonis Product Documentation

UNION_ALL
Description

The UNION_ALL function returns the vertical concatenation of columns.

Warning

Operator performance This operator allows up to eight input arguments. Please note that an increasing number of provided arguments can have a more significant negative impact on filter propagation performance. We therefore recommend to use UNION_ALL on as few input columns as possible.

SYNTAX
 UNION_ALL ( table.column1 , ... , table.columnN )

UNION_ALL requires at least 2 and at most 16 arguments.

NULL handling

NULL values are preserved.

Limitations

The current limitations documented in Engine Limitations also hold for the result of UNION_ALL. This means that the hard upper limit of rows for the result column of UNION_ALL is 2.1 billion rows, and the recommended limit of rows is 800 million rows.

The owning table of the input columns may not be temporary, i.e., constants and GLOBAL are not allowed.

Filter handling

Filters set on tables joined to the input tables to the UNION_ALL operator are projected to the input tables according to the standard filter propagation rules and then forwarded to the UNION_ALL table. In case the common table is not the UNION_ALL table but a table joined to it (like a domain table), the filter is further propagated according to the standard filter propagation rules.

Filters set on a UNION_ALL table are also projected back to the input tables. From there, the filters are further propagated to other connected tables. If a table is connected to multiple different input tables, the "distance" from it to the input tables determines how the projected UNION_ALL filters are propagated.

If a table t has the same distance from two (or more) different input tables, the (projected UNION_ALL) filters from each these tables are propagated to t and are or-ed together. If the distance is not the same, only the filters from the "closest" input tables are propagated. Filters from the input tables further away do not influence the shown values for t.

The "distance" is not directly the number of edges in the join graph, but the number of direction switches (from 1-to-n to n-to-1 and vice versa). In other words: longer runs of repeated 1-n joins only count as a distance of 1. The same holds for n-1 joins. So in the following example: "a 1-n t1 1-n t2 1-n t3 n-1 t4 n-1 t5 1-n b", the distance between the two tables a and b is 2 as there is a direction switch at t3 and at t5.

See below for an example of how this concretely works out in a simple example.

Please note that filter propagation through the Data Model might become less meaningful if input columns from many different tables are used inside UNION_ALL.

Nesting of UNION_ALL

UNION_ALL cannot be nested.

Examples

[1]

Concatenation of two string columns

Query

Column1

         UNION_ALL ( "caseTable"."orderid" , "activityTable"."activity" )
        

Input

Output

City

city : string

size : int

'Frankfurt'

3500000

'Boston'

600000

'Lyon'

2200000

'Seattle'

640000

Country

country : string

capitol : string

'DE'

'Berlin'

'US'

'Washington'

'FR'

'Paris'

OrderPos

orderId : string

pos : int

country : string

city : string

'001'

1

'DE'

'Frankfurt'

'001'

2

'US'

'Boston'

'001'

3

'FR'

'Lyon'

'002'

1

'US'

'Seattle'

activityTable

activity : string

caseId : int

'A'

1

'B'

1

'C'

2

caseTable

caseId : int

orderId : string

1

'001'

2

'002'

Foreign Keys

caseTable.caseId

activityTable.caseId

caseTable.orderId

OrderPos.orderId

OrderPos.country

Country.country

OrderPos.city

City.city

Result

Column1 : string

'001'

'002'

'A'

'B'

'C'

[2]

Filter on a UNION_ALL table affects the very same UNION_ALL table

Query

Filter

         FILTER UNION_ALL ( "caseTable"."orderid" , "activityTable"."activity" ) = '001';
        

Column1

         UNION_ALL ( "caseTable"."orderid" , "activityTable"."activity" )
        

Input

Output

City

city : string

size : int

'Frankfurt'

3500000

'Boston'

600000

'Lyon'

2200000

'Seattle'

640000

Country

country : string

capitol : string

'DE'

'Berlin'

'US'

'Washington'

'FR'

'Paris'

OrderPos

orderId : string

pos : int

country : string

city : string

'001'

1

'DE'

'Frankfurt'

'001'

2

'US'

'Boston'

'001'

3

'FR'

'Lyon'

'002'

1

'US'

'Seattle'

activityTable

activity : string

caseId : int

'A'

1

'B'

1

'C'

2

caseTable

caseId : int

orderId : string

1

'001'

2

'002'

Foreign Keys

caseTable.caseId

activityTable.caseId

caseTable.orderId

OrderPos.orderId

OrderPos.country

Country.country

OrderPos.city

City.city

Result

Column1 : string

'001'

[3]

We use the reverse order on the union tables to get another UNION table. The filters from the one table are propagated to the other table.

Query

Filter

         FILTER UNION_ALL ( "activityTable"."activity" , "caseTable"."orderid" ) = '001';
        

Column1

         UNION_ALL ( "caseTable"."orderid" , "activityTable"."activity" )
        

Input

Output

City

city : string

size : int

'Frankfurt'

3500000

'Boston'

600000

'Lyon'

2200000

'Seattle'

640000

Country

country : string

capitol : string

'DE'

'Berlin'

'US'

'Washington'

'FR'

'Paris'

OrderPos

orderId : string

pos : int

country : string

city : string

'001'

1

'DE'

'Frankfurt'

'001'

2

'US'

'Boston'

'001'

3

'FR'

'Lyon'

'002'

1

'US'

'Seattle'

activityTable

activity : string

caseId : int

'A'

1

'B'

1

'C'

2

caseTable

caseId : int

orderId : string

1

'001'

2

'002'

Foreign Keys

caseTable.caseId

activityTable.caseId

caseTable.orderId

OrderPos.orderId

OrderPos.country

Country.country

OrderPos.city

City.city

Result

Column1 : string

'001'

'A'

'B'

[4]

Filters on a UNION_ALL table are propagated to their origin. From the origin tables the filters are propagated to the rest of the data model.

Query

Filter

         FILTER UNION_ALL ( "activityTable"."activity" , "caseTable"."orderid" ) = '001';
        

Column1

         "activityTable"."activity"
        

Input

Output

City

city : string

size : int

'Frankfurt'

3500000

'Boston'

600000

'Lyon'

2200000

'Seattle'

640000

Country

country : string

capitol : string

'DE'

'Berlin'

'US'

'Washington'

'FR'

'Paris'

OrderPos

orderId : string

pos : int

country : string

city : string

'001'

1

'DE'

'Frankfurt'

'001'

2

'US'

'Boston'

'001'

3

'FR'

'Lyon'

'002'

1

'US'

'Seattle'

activityTable

activity : string

caseId : int

'A'

1

'B'

1

'C'

2

caseTable

caseId : int

orderId : string

1

'001'

2

'002'

Foreign Keys

caseTable.caseId

activityTable.caseId

caseTable.orderId

OrderPos.orderId

OrderPos.country

Country.country

OrderPos.city

City.city

Result

Column1 : string

'A'

'B'

[5]

If an origin table appears twice in the UNION_ALL table, one of the rows passing the filter on the UNION_ALL table is enough for it to be shown in the origin table.

Query

Filter

         FILTER UNION_ALL ( "activityTable"."activity" , TO_STRING ( "activityTable"."caseId" ) ) = 'A';
        

Column1

         "activityTable"."activity"
        

Column2

         "activityTable"."caseId"
        

Input

Output

City

city : string

size : int

'Frankfurt'

3500000

'Boston'

600000

'Lyon'

2200000

'Seattle'

640000

Country

country : string

capitol : string

'DE'

'Berlin'

'US'

'Washington'

'FR'

'Paris'

OrderPos

orderId : string

pos : int

country : string

city : string

'001'

1

'DE'

'Frankfurt'

'001'

2

'US'

'Boston'

'001'

3

'FR'

'Lyon'

'002'

1

'US'

'Seattle'

activityTable

activity : string

caseId : int

'A'

1

'B'

1

'C'

2

caseTable

caseId : int

orderId : string

1

'001'

2

'002'

Foreign Keys

caseTable.caseId

activityTable.caseId

caseTable.orderId

OrderPos.orderId

OrderPos.country

Country.country

OrderPos.city

City.city

Result

Column1 : string

Column2 : int

'A'

1

[6]

The next two examples (based on the same data model) show how the filter propagation works for tables that are connected to more than one UNION_ALL input table. For both examples the "activities" tables (on the n side) are connected by the same case table (on the 1 side)

The table "activities2" is itself an origin table, so the filter that directly comes from the UNION_ALL (distance 0) dominates the filter that first has to be propagated from the "activities2" table (distance 1).

Query

Filter

         FILTER UNION_ALL ( "activities1"."timestamp" , "activities2"."timestamp" ) = {d '2021-01-03' };
        

Column1

         "activities2"."caseId"
        

Column2

         "activities2"."activity"
        

Column3

         "activities2"."timestamp"
        

Input

Output

activities1

caseId : int

activity : string

timestamp : date

1

'A1'

Fri Jan 01 2021 00:00:00.000

1

'B1'

Sat Jan 02 2021 00:00:00.000

2

'C1'

Sun Jan 03 2021 00:00:00.000

activities2

caseId : int

activity : string

timestamp : date

1

'A2'

Fri Jan 01 2021 00:00:00.000

2

'B2'

Sat Jan 02 2021 00:00:00.000

3

'C2'

Sun Jan 03 2021 00:00:00.000

cases

caseId : int

orderId : string

1

'001'

2

'002'

3

'003'

Foreign Keys

cases.caseId

activities1.caseId

cases.caseId

activities2.caseId

Result

Column1 : int

Column2 : string

Column3 : date

3

'C2'

Sun Jan 03 2021 00:00:00.000

[7]

The next two examples (based on the same data model) show how the filter propagation works for tables that are connected to more than one UNION_ALL input table. For both examples the "activities" tables (on the n side) are connected by the same case table (on the 1 side)

The table "activities2" is itself an origin table, so the filter that directly comes from the UNION_ALL (distance 0) dominates the filter that first has to be propagated from the "activities2" table (distance 1).

Query

Filter

         FILTER UNION_ALL ( "activities1"."timestamp" , "activities2"."timestamp" ) = {d '2021-01-03' };
        

Column1

         "activities2"."caseId"
        

Column2

         "activities2"."activity"
        

Column3

         "activities2"."timestamp"
        

Input

Output

activities1

caseId : int

activity : string

timestamp : date

1

'A1'

Fri Jan 01 2021 00:00:00.000

1

'B1'

Sat Jan 02 2021 00:00:00.000

2

'C1'

Sun Jan 03 2021 00:00:00.000

activities2

caseId : int

activity : string

timestamp : date

1

'A2'

Fri Jan 01 2021 00:00:00.000

2

'B2'

Sat Jan 02 2021 00:00:00.000

3

'C2'

Sun Jan 03 2021 00:00:00.000

cases

caseId : int

orderId : string

1

'001'

2

'002'

3

'003'

Foreign Keys

cases.caseId

activities1.caseId

cases.caseId

activities2.caseId

Result

Column1 : int

Column2 : string

Column3 : date

3

'C2'

Sun Jan 03 2021 00:00:00.000