tdengine

How does TDengine compare to other real-time data processing systems like Apache Kafka or Apache Flink?


Can TDengine support real-time data streaming and processing? How does it compare to other real-time data processing systems such as Apache Kafka or Apache Flink? In practice, we generate large volumes of real-time data every day, and we need to both store these data streams and run business processing on them. Is there a detailed reference solution?



Solution

  • With the stream processing engine built into TDengine, you can process incoming data streams in real time and define stream transformations in SQL. Incoming data is processed automatically, and the results are pushed to the tables you specify, based on triggering rules that you define. This is a lightweight alternative to complex processing engines, and it returns computation results in milliseconds even in high-throughput scenarios.

    The stream processing engine supports data filtering, scalar function computation (including user-defined functions), and window aggregation, with support for sliding windows, session windows, and event windows. A stream can read from supertables, standard tables, or subtables and write its results to a target supertable. When you create a stream, the target supertable is created automatically. New data is then processed and written to that supertable according to the rules defined for the stream. You can use a PARTITION BY clause to partition the data by table name or tag. Each partition is then written to its own subtable within the target supertable.

    TDengine stream processing supports aggregation over supertables that are distributed across multiple vnodes. It can also handle out-of-order writes: a watermark mechanism determines how much out-of-order data the system accepts, and the IGNORE EXPIRED option lets you configure whether out-of-order data that arrives too late is dropped or reprocessed.

    The API style of TDengine streaming is similar to Kafka's.
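
To make the description above more concrete, here is a minimal sketch of what a stream definition could look like. It only builds the SQL text in Python and does not connect to a server. The names `avg_vol_s`, `avg_vol`, `meters`, and `voltage` are assumptions for illustration, and so is the helper `build_create_stream_sql`. The statement layout follows the `CREATE STREAM` syntax as documented for TDengine 3.0; stream syntax differs between releases, so check the reference for your version before running it.

```python
def build_create_stream_sql(stream, target, source, select_list, window,
                            partition_by="tbname", trigger=None,
                            watermark=None, ignore_expired=None):
    """Assemble a CREATE STREAM statement as plain text (hypothetical helper).

    Nothing is sent to TDengine here; the function only joins the pieces
    the answer describes: trigger rule, watermark, IGNORE EXPIRED,
    target supertable, PARTITION BY, and a window clause.
    """
    options = []
    if trigger is not None:
        options.append(f"TRIGGER {trigger}")
    if watermark is not None:
        options.append(f"WATERMARK {watermark}")
    if ignore_expired is not None:
        # Assumed meaning (3.0 docs): 1 drops expired data, 0 reprocesses it.
        options.append(f"IGNORE EXPIRED {int(ignore_expired)}")

    head = " ".join(["CREATE STREAM IF NOT EXISTS", stream, *options,
                     "INTO", target])
    query = (f"SELECT {', '.join(select_list)} FROM {source} "
             f"PARTITION BY {partition_by} {window}")
    return f"{head} AS {query};"


# Example: a 1-minute window sliding every 30 seconds over an assumed
# supertable `meters`, one output subtable per source table.
sql = build_create_stream_sql(
    stream="avg_vol_s",
    target="avg_vol",
    source="meters",
    select_list=["_wstart", "count(*)", "avg(voltage)"],
    window="INTERVAL(1m) SLIDING(30s)",
    trigger="WINDOW_CLOSE",
    watermark="10s",
    ignore_expired=False,
)
print(sql)
```

The resulting statement can be run from the taos shell or any client connector. Partitioning by `tbname` is what causes each source table to get its own subtable in the target supertable; partitioning by a tag column would group the results by that tag instead.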