apache-kafka · hive · apache-flink · flink-streaming

I want to make a streaming job in Apache Flink to do Kafka -> Flink -> Hive


I want to build a streaming job in Apache Flink (Scala) that reads from Kafka and writes to Hive (Kafka -> Flink -> Hive). Can anyone give a code sample? The official documentation is not very clear to me.

This should be a streaming process.


Solution

  • For help getting started with the Table API, Real Time Reporting with the Table API is a tutorial you can follow. It's in Java, but the Scala API isn't much different.

    This is an example of using SQL to read from Kafka and write to Hive. To do the same from Scala you can wrap the SQL statements with tableEnv.executeSql(...), as in

    tableEnv.executeSql("CREATE TABLE Orders (`user` BIGINT, product STRING, amount INT) WITH (...)")
    

    or

    val tableResult1 = tEnv.executeSql("INSERT INTO ...")
    
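    Note that executeSql with an INSERT submits the job asynchronously and returns a TableResult. If you want to block until the submitted job finishes, you can call await() on the result, though an unbounded streaming insert will never finish, so usually you just let executeSql submit the job:

    // Optional: block until the submitted job completes (an unbounded
    // streaming insert never completes, so this is mainly useful for bounded jobs)
    tableResult1.await()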

    If you need to do multiple inserts, then you'll need to do it a bit differently, using a StatementSet (a minimal sketch follows). See the docs linked below for details.
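
    For example, a minimal StatementSet sketch, where the table names are placeholders for your own source and sink tables:

    // Group several INSERTs so they are optimized and submitted as one Flink job
    val stmtSet = tableEnv.createStatementSet()
    stmtSet.addInsertSql("INSERT INTO hive_orders SELECT `user`, product, amount FROM kafka_orders")
    stmtSet.addInsertSql("INSERT INTO hive_order_counts SELECT product, COUNT(*) FROM kafka_orders GROUP BY product")
    stmtSet.execute()  // submits all inserts together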

    See Run a CREATE statement, Run an INSERT statement, Apache Kafka SQL Connector, and Writing to Hive.
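
    Putting the pieces together, here is a rough end-to-end sketch in Scala. It is not a drop-in solution: the Hive conf directory, Kafka topic, broker address, formats, and table schemas are all assumptions you will need to adapt, and you'll need the flink-connector-hive and Kafka connector dependencies on the classpath.

    import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
    import org.apache.flink.table.api.SqlDialect
    import org.apache.flink.table.api.bridge.scala.StreamTableEnvironment
    import org.apache.flink.table.catalog.hive.HiveCatalog

    object KafkaToHiveJob {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
        env.enableCheckpointing(60000)  // Hive/filesystem sinks commit files on checkpoints
        val tableEnv = StreamTableEnvironment.create(env)

        // Register a Hive catalog; point the third argument at the directory containing hive-site.xml
        val hive = new HiveCatalog("myhive", "default", "/etc/hive/conf")
        tableEnv.registerCatalog("myhive", hive)
        tableEnv.useCatalog("myhive")

        // Kafka source table, defined with the default (Flink) SQL dialect
        tableEnv.getConfig.setSqlDialect(SqlDialect.DEFAULT)
        tableEnv.executeSql(
          """CREATE TABLE IF NOT EXISTS kafka_orders (
            |  `user` BIGINT,
            |  product STRING,
            |  amount INT
            |) WITH (
            |  'connector' = 'kafka',
            |  'topic' = 'orders',
            |  'properties.bootstrap.servers' = 'localhost:9092',
            |  'properties.group.id' = 'flink-orders',
            |  'scan.startup.mode' = 'earliest-offset',
            |  'format' = 'json'
            |)""".stripMargin)

        // Hive sink table, created with the Hive dialect so it is a regular Hive table
        tableEnv.getConfig.setSqlDialect(SqlDialect.HIVE)
        tableEnv.executeSql(
          """CREATE TABLE IF NOT EXISTS hive_orders (
            |  `user` BIGINT,
            |  product STRING,
            |  amount INT
            |) STORED AS parquet""".stripMargin)

        // Continuous INSERT from Kafka into Hive; executeSql submits the streaming job
        tableEnv.getConfig.setSqlDialect(SqlDialect.DEFAULT)
        tableEnv.executeSql(
          "INSERT INTO hive_orders SELECT `user`, product, amount FROM kafka_orders")
      }
    }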

    If you get stuck, show us what you've tried and how it is failing.