Skip to main content

Understanding Differences Between First-Generation and Premium Process Query Engine Behavior

Description

As part of our efforts to enable scalability and support for more complex, data-intensive processes, we are transitioning from the first-generation engine to a new engine architecture. The Premium Process Query Engine lays the groundwork for handling larger data volumes while maintaining reliable performance and responsiveness.

This shift may lead to differences in query results due to updated logic and computation methods that align with modern standards and ensure optimal performance. This document outlines the most significant and commonly encountered differences, along with guidance on how to validate results and adjust queries when needed to match the behavior of the previous engine.

Please note that other, more subtle differences may exist and are not exhaustively covered here.

Difference in NOT Filter Behavior for NULL values

Description

The behavior of the NOT operator in filters differs due to how each engine handles NULL values in logical expressions.

First-generation Engine

Premium Engine

Logic Behaviour

Two-valued logic (TRUE or FALSE)

Three-valued logic (TRUE, FALSE, or UNKNOWN/NULL)

NULL in conditions

Treated as FALSE

Evaluated as UNKNOWN (excluded when negated)

Effect of NOT

NULL values are included in result

NULL values are excluded from result

Example

Query

FILTER NOT ("mytable"."id" = 1);
  • First-generation engine: Returns all rows where id is not 1, including rows where id is NULL.

  • Premium Engine: Returns only rows where id is not 1, but excludes rows where id is NULL.

Recommendation

If your data may include NULLs and you want to preserve first-generation engine behavior, adjust the filter to explicitly handle them:

Query

FILTER ( "mytable"."id" IS NULL OR "mytable"."id" != 1 );

This explicitly includes rows where id is null or not equal to 1, maintaining compatibility with the previous logic.

Difference in QUANTILE / MEDIAN Approximation

Description

In the first-generation engine, all aggregations - including MEDIAN and QUANTILE - were always computed using exact methods.

In contrast, the Premium Engine performs exact aggregations by default for all operators except MEDIAN and QUANTILE, which use approximate algorithms to improve scalability and performance. While this enhances speed, it may lead to minor deviations from exact results, particularly on small or edge-case datasets.

To maintain full control, the new engine provides the ability to explicitly choose between exact and approximate modes, allowing users to balance accuracy and performance based on their specific use case.

Example

MEDIAN (approximate) -> uses QUANTILE 0.5, which uses APPROX

Dataset: [1, 1, 2, 3, 4, 5]

Quantile: 0.5 ( = MEDIAN)

  • First-generation engine: 3

  • Premium Engine: (APPROX, default): 2

  • Premium Engine: (EXACT): 3

Recommendation

By default, the APPROX operator is used for both functions. If you require the EXACT logic for cases where precision is critical - particularly for KPIs that depend on statistical accuracy - please contact our support team. We will be happy to assist you.