I have a table with data and I need to join the table with itself on two fields.
I wrote the following query, but it does not work:
SELECT *
FROM Data t1
JOIN Data t2 ON t1.s = t2.o
The code is:
import org.apache.flink.table.api.Types
import org.apache.flink.table.sources.CsvTableSource

val csvTableSource = CsvTableSource
.builder
.path("src/main/resources/data.dat")
.field("s", Types.STRING)
.field("p", Types.STRING)
.field("o", Types.STRING)
.field("TIMESTAMP", Types.STRING)
.fieldDelimiter(",")
.ignoreFirstLine
.ignoreParseErrors
.commentPrefix("%")
.build()
tableEnv.registerTableSource("Data", csvTableSource)
val query = "SELECT * FROM Data t1 JOIN Data t2 ON t1.s = t2.o"
val table = tableEnv.sqlQuery(query)
I get the following exception:
Exception in thread "main" org.apache.flink.table.api.TableException: Cannot generate a valid execution plan for the given query:
FlinkLogicalJoin(condition=[=($0, $6)], joinType=[inner])
FlinkLogicalTableSourceScan(table=[[Data]], fields=[s, p, o, TIMESTAMP], source=[CsvTableSource(read fields: s, p, o, TIMESTAMP)])
FlinkLogicalTableSourceScan(table=[[Data]], fields=[s, p, o, TIMESTAMP], source=[CsvTableSource(read fields: s, p, o, TIMESTAMP)])
This exception indicates that the query uses an unsupported SQL feature.
Please check the documentation for the set of currently supported SQL features.
I guess you are trying to run this query in a streaming environment. Non-windowed joins on streaming tables were added in Flink 1.5.0.
So you are trying to use a feature that is not yet supported in Flink 1.4.2.
You can either switch to a batch environment, which should be possible given that you are reading CSV files, or upgrade to Flink 1.5.0. A sketch of the batch variant follows.
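Below is a minimal sketch of the batch approach, assuming Flink 1.4.x with the flink-scala and flink-table dependencies on the classpath and reusing the data.dat path and schema from your question; the object name BatchSelfJoin is made up for illustration.

import org.apache.flink.api.scala._
import org.apache.flink.table.api.{TableEnvironment, Types}
import org.apache.flink.table.api.scala._
import org.apache.flink.table.sources.CsvTableSource
import org.apache.flink.types.Row

object BatchSelfJoin {
  def main(args: Array[String]): Unit = {
    // Batch environment instead of a streaming one
    val env = ExecutionEnvironment.getExecutionEnvironment
    val tableEnv = TableEnvironment.getTableEnvironment(env)

    // Same CSV source definition as in the question
    val csvTableSource = CsvTableSource
      .builder
      .path("src/main/resources/data.dat")
      .field("s", Types.STRING)
      .field("p", Types.STRING)
      .field("o", Types.STRING)
      .field("TIMESTAMP", Types.STRING)
      .fieldDelimiter(",")
      .ignoreFirstLine
      .ignoreParseErrors
      .commentPrefix("%")
      .build()

    tableEnv.registerTableSource("Data", csvTableSource)

    // The batch planner in 1.4.x already supports this non-windowed self-join
    val table = tableEnv.sqlQuery("SELECT * FROM Data t1 JOIN Data t2 ON t1.s = t2.o")

    // Convert the result into a DataSet of Rows and print it (this triggers execution)
    tableEnv.toDataSet[Row](table).print()
  }
}

Only the environment changes; the CsvTableSource and the SQL string are reused as they are. If you upgrade to Flink 1.5.0 instead, the original streaming setup should accept the same query.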