Defining a Time-based Property Transform Window

Property Transform data collection windows can be defined either by a time range or by a number of data points. When defining a time-based window, it is important to understand how the Flink engine measures time. Otherwise, a time-based Property Transform might appear to return incorrect results.

Flink is driven by the timestamps carried on your data points, not by the time at which the data points are received. As a result, the Property Transform computation cannot rely on a timer that counts down and launches the calculation at fixed intervals. In addition, because data points can arrive slightly out of order, the timestamp of the first data point cannot be used as the starting point for a timer either.

Instead, Flink aligns time windows to epoch time. The first window starts at timestamp 0 (the epoch) and ends at 0 + window size. For example, if the window size is 5000 milliseconds (5 seconds), the first window covers timestamps 0 through 4999 milliseconds. Later windows are offset from this first window by n steps, where the size of each step depends on the calculation interval. To see this concept expressed as an equation, see the following:
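As a minimal sketch of this epoch alignment, the plain Python below computes which windows a given timestamp falls into. It is not Flink or Property Transform code. The function name windows_containing and its parameters are hypothetical, and the sketch assumes that window start times are whole multiples of the calculation interval measured from timestamp 0, and that the interval equals the window size when none is given.

```python
def windows_containing(timestamp_ms, window_size_ms, interval_ms=None):
    """Return the (start, end) bounds of every epoch-aligned window that
    holds timestamp_ms. Bounds are half-open: start is included, end is not.

    Assumption: window starts are multiples of interval_ms counted from 0.
    """
    if interval_ms is None:
        # Assumed default: windows do not overlap.
        interval_ms = window_size_ms

    # Latest window start at or before the timestamp.
    start = timestamp_ms - (timestamp_ms % interval_ms)

    windows = []
    # Step backwards while the window still reaches the timestamp.
    while start > timestamp_ms - window_size_ms:
        windows.append((start, start + window_size_ms))
        start -= interval_ms
    return windows


if __name__ == "__main__":
    # 5-second windows: 4999 ms is in the first window, 5000 ms is in the second.
    print(windows_containing(4_999, 5_000))   # [(0, 5000)]
    print(windows_containing(5_000, 5_000))   # [(5000, 10000)]
```

Note that the window bounds depend only on the timestamp, the window size, and the interval, never on when the first data point happened to arrive.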

One final point about timing is important for understanding Flink calculations. In a time-based window, Flink treats the lower bound of the window as inclusive and the upper bound as exclusive. For example, in a window ranging from 2:00 to 2:10, a point with a timestamp of exactly 2:00 is part of the window, but a point with a timestamp of exactly 2:10 is not. If data points are received with timestamps of 2:02, 2:04, and 2:08, the calculation for that window is not triggered until a data point arrives with a timestamp of 2:10 or later. This behavior may seem odd, but remember that Flink is driven by the timestamps on the data points. The Flink engine does not know that the window has ended until a point arrives with a timestamp outside of that range. That point is not included in the calculation. It only acts as the trigger for Flink to launch the calculation. See the example in the image below.
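The trigger behavior described above can be imitated with a small simulation. This is a simplified illustration in plain Python, not Flink code: the function collect_closed_windows is hypothetical, it assumes non-overlapping windows aligned to timestamp 0, and it ignores watermarks and late-data handling by simply skipping any point that belongs to an already closed window.

```python
MINUTE_MS = 60_000


def collect_closed_windows(points, window_size_ms):
    """Replay (timestamp_ms, value) pairs and return the windows that closed.

    A window [start, end) is emitted only when a point with a timestamp at or
    beyond its end arrives. That triggering point goes into the next window.
    """
    closed = []
    current_start = None
    buffered = []

    for timestamp_ms, value in points:
        start = timestamp_ms - (timestamp_ms % window_size_ms)

        if current_start is not None and start < current_start:
            # Simplification: drop points for windows that already closed.
            continue

        if current_start is not None and start > current_start:
            # A point from outside the open window triggers its calculation.
            closed.append((current_start, current_start + window_size_ms, buffered))
            buffered = []

        current_start = start
        buffered.append(value)

    return closed


if __name__ == "__main__":
    # Timestamps 2:02, 2:04, 2:08 expressed as milliseconds since midnight.
    inside = [(122 * MINUTE_MS, 1.0), (124 * MINUTE_MS, 2.0), (128 * MINUTE_MS, 3.0)]

    # No point outside 2:00-2:10 has arrived yet, so nothing is calculated.
    print(collect_closed_windows(inside, 10 * MINUTE_MS))

    # A point stamped exactly 2:10 closes the window but is not part of it.
    print(collect_closed_windows(inside + [(130 * MINUTE_MS, 4.0)], 10 * MINUTE_MS))
```

In the second call, the window from 2:00 to 2:10 is emitted with the three values 1.0, 2.0, and 3.0 only. The value stamped 2:10 waits in the following window until a later point closes that one in turn.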

For more detailed information about how Flink handles time-based windows, see Event Time and other resources on the Flink website.