I have a use case where I need to deduplicate data using the Table API while streaming it from a source to a sink. This documentation is very clear for that use case, but I don't understand how state management works here. When is the state cleaned up internally? For example, if I receive a duplicate order_id a couple of weeks later, will it still be removed as a duplicate even though it is processed 14 days after the first one? In other words, how often does the SQL job clear its state? This is not mentioned on the page I am reading. Do I need to look at some other concept for this?
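The state is never cleared internally unless you explicitly configure it. By default the idle state retention time (`table.exec.state.ttl`) is 0, which means state is kept indefinitely, so a duplicate order_id arriving 14 days later would still be recognised as a duplicate. If you set a retention time, state for a key that has not been updated within that time is removed, and a later record with the same key is treated as new. This is explained in more detail in the State Usage section of the documentation: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/table/concepts/overview/#state-usage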
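
To make the effect of a retention time concrete, here is a minimal sketch in plain Python. It is not Flink code and does not use any Flink API: `TtlDeduplicator` and its `accept` method are hypothetical names made up for illustration, and the model assumes that the retention clock starts when a key is first stored and is not refreshed by later duplicates.

```python
from datetime import datetime, timedelta


class TtlDeduplicator:
    """Toy model of keyed deduplication with an optional state time-to-live.

    Hypothetical illustration only; this is not how Flink is implemented.
    """

    def __init__(self, ttl=None):
        # ttl=None models the default behaviour: state is never cleared.
        self.ttl = ttl
        self.first_seen = {}  # key -> time at which the key was stored

    def accept(self, key, now):
        """Return True if the record is emitted, False if dropped as a duplicate."""
        stored_at = self.first_seen.get(key)
        # Assumption: the entry expires once it is at least `ttl` old.
        expired = (
            stored_at is not None
            and self.ttl is not None
            and now - stored_at >= self.ttl
        )
        if stored_at is None or expired:
            # First sighting, or the old entry has expired: store and emit.
            self.first_seen[key] = now
            return True
        return False


start = datetime(2022, 1, 1)

results = {}
for name, dedup in (
    ("no_ttl", TtlDeduplicator()),
    ("with_ttl", TtlDeduplicator(ttl=timedelta(days=7))),
):
    results[name] = [
        dedup.accept("order-1", start),                       # first record
        dedup.accept("order-1", start + timedelta(days=1)),   # duplicate, 1 day later
        dedup.accept("order-1", start + timedelta(days=14)),  # duplicate, 14 days later
    ]

print(results)
```

Without a retention time, both duplicates are dropped. With a 7-day retention time, the duplicate after 1 day is dropped, but the one after 14 days is emitted again because the stored key has already expired. So the retention time you choose should be at least as long as the window in which you expect duplicates to arrive.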