apache-flink

What's the practical use of DataStream#assignAscendingTimestamps?


The Javadoc for DataStream#assignAscendingTimestamps says:

 * Assigns timestamps to the elements in the data stream and periodically creates
 * watermarks to signal event time progress.
 *
 * This method is a shortcut for data streams where the element timestamp are known
 * to be monotonously ascending within each parallel stream.
 * In that case, the system can generate watermarks automatically and perfectly
 * by tracking the ascending timestamps.

This method assumes that element timestamps are known to be monotonically ascending within each parallel stream. In practice, though, almost no stream can guarantee that event timestamps arrive in ascending order.

I would like to conclude that this method should never be used, but have I missed something (e.g., a case where it is appropriate)?
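To make the assumption in the Javadoc concrete, here is a minimal plain-Java sketch of the "track the ascending timestamps" idea. It does not use any Flink API: the class `AscendingWatermarkSketch` and its methods are hypothetical names, and the rule "watermark = highest timestamp seen minus 1" is a simplified model of what an ascending-timestamp assigner does, not Flink's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of ascending-timestamp watermarking (not Flink code).
class AscendingWatermarkSketch {
    // Highest timestamp seen so far; Long.MIN_VALUE means "no element yet".
    private long currentTimestamp = Long.MIN_VALUE;
    // Timestamps that arrived out of order, i.e. below the highest one seen.
    private final List<Long> violations = new ArrayList<>();

    /** Records one element's timestamp and remembers it if it breaks the ascending order. */
    public void onElement(long timestamp) {
        if (timestamp >= currentTimestamp) {
            currentTimestamp = timestamp;
        } else {
            violations.add(timestamp);
        }
    }

    /** Watermark trails the highest timestamp by 1, so equal timestamps are never late. */
    public long currentWatermark() {
        return currentTimestamp == Long.MIN_VALUE ? Long.MIN_VALUE : currentTimestamp - 1;
    }

    public List<Long> violations() {
        return violations;
    }

    public static void main(String[] args) {
        AscendingWatermarkSketch ordered = new AscendingWatermarkSketch();
        for (long ts : new long[] {100, 200, 200, 350}) {
            ordered.onElement(ts);
        }
        // prints "watermark=349 violations=[]"
        System.out.println("watermark=" + ordered.currentWatermark()
                + " violations=" + ordered.violations());

        AscendingWatermarkSketch unordered = new AscendingWatermarkSketch();
        for (long ts : new long[] {100, 350, 200}) {
            unordered.onElement(ts);
        }
        // prints "watermark=349 violations=[200]"
        System.out.println("watermark=" + unordered.currentWatermark()
                + " violations=" + unordered.violations());
    }
}
```

In the second run the element with timestamp 200 arrives after the watermark has already reached 349, so under this model it would count as late. That is the risk the question is pointing at.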


Solution

  • Generally I agree: it can rarely be used in practice. One exception is the following: if Kafka is used as a source with LogAppendTime, timestamps are in order within each partition. You can then use per-partition watermarking in Flink [1] with the AscendingTimestampExtractor and get close to optimal watermarking.

    Cheers,

    Konstantin

    [1] https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/connectors/kafka.html#kafka-consumers-and-timestamp-extractionwatermark-emission
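
The per-partition argument above can be illustrated with a small plain-Java simulation. It uses neither Flink nor Kafka APIs: `PerPartitionWatermarkSketch`, the partition numbers, and the timestamps are all made up for illustration. It assumes each partition gets its own ascending tracker and that the combined watermark is the minimum of the per-partition watermarks, which is how the linked documentation describes merging them.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simulation of per-partition ascending watermarks (not Flink or Kafka code).
class PerPartitionWatermarkSketch {
    // Highest timestamp seen so far in each partition.
    private final Map<Integer, Long> maxTimestampPerPartition = new HashMap<>();
    // Number of records that were out of order within their own partition.
    private int violations = 0;

    /** Tracks one record; order is only checked against records of the same partition. */
    public void onRecord(int partition, long timestamp) {
        Long previous = maxTimestampPerPartition.get(partition);
        if (previous != null && timestamp < previous) {
            violations++;
            return;
        }
        maxTimestampPerPartition.put(partition, timestamp);
    }

    /** Combined watermark: the minimum over all per-partition watermarks (max - 1). */
    public long combinedWatermark() {
        if (maxTimestampPerPartition.isEmpty()) {
            return Long.MIN_VALUE;
        }
        long min = Long.MAX_VALUE;
        for (long max : maxTimestampPerPartition.values()) {
            min = Math.min(min, max - 1);
        }
        return min;
    }

    public int violations() {
        return violations;
    }

    public static void main(String[] args) {
        // {partition, timestamp}; the interleaved order 100, 90, 120, 95, 130, 110
        // is not ascending, but each partition on its own is.
        long[][] records = {{0, 100}, {1, 90}, {0, 120}, {1, 95}, {0, 130}, {1, 110}};
        PerPartitionWatermarkSketch sketch = new PerPartitionWatermarkSketch();
        for (long[] record : records) {
            sketch.onRecord((int) record[0], record[1]);
        }
        // prints "watermark=109 violations=0"
        System.out.println("watermark=" + sketch.combinedWatermark()
                + " violations=" + sketch.violations());
    }
}
```

The merged stream would violate the ascending assumption, but checking order per partition finds no violations, and the combined watermark (109) never overtakes a record that is still to come from the slower partition.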