I'm trying to reproduce the Flink quick-start example for Hudi (https://hudi.apache.org/docs/flink-quick-start-guide), but when I try to insert the example data, an error appears. Can someone help me with this?
The steps I'm following on my AWS EMR cluster are:
export JVM_ARGS=-Djava.io.tmpdir=/mnt/tmp
sudo aws s3 cp MyBucketLocation/hudi-flink-bundle_2.11-0.10.0.jar /lib/flink/lib/hudi-flink-bundle_2.11-0.10.0.jar
# Start the Flink SQL client
/usr/lib/flink/bin/sql-client.sh
-- Create the table
CREATE TABLE t1(
uuid VARCHAR(20) PRIMARY KEY NOT ENFORCED,
name VARCHAR(10),
age INT,
ts TIMESTAMP(3),
`partition` VARCHAR(20)
)
PARTITIONED BY (`partition`)
WITH (
'connector' = 'hudi',
'path' = 's3://issue-lmdl-s3-ldz/msk/Flink/kafka/',
'table.type' = 'MERGE_ON_READ' -- creates a MERGE_ON_READ table; the default is COPY_ON_WRITE
);
-- Insert the example data as shown in the documentation
INSERT INTO t1 VALUES
('id1','Danny',23,TIMESTAMP '1970-01-01 00:00:01','par1'),
('id2','Stephen',33,TIMESTAMP '1970-01-01 00:00:02','par1'),
('id3','Julian',53,TIMESTAMP '1970-01-01 00:00:03','par2'),
('id4','Fabian',31,TIMESTAMP '1970-01-01 00:00:04','par2'),
('id5','Sophia',18,TIMESTAMP '1970-01-01 00:00:05','par3'),
('id6','Emma',20,TIMESTAMP '1970-01-01 00:00:06','par3'),
('id7','Bob',44,TIMESTAMP '1970-01-01 00:00:07','par4'),
('id8','Han',56,TIMESTAMP '1970-01-01 00:00:08','par4');
I'm working with EMR 6.8.0, and the Flink SQL client has already worked with Kafka. I just want to write these records in Hudi format.
It turned out to be a version problem. I fixed it by upgrading the Hudi library to a bundle built for Flink 1.15 or higher.
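For anyone hitting the same thing, here is a rough sketch of how one might check which Flink version a Hudi bundle jar targets before copying it into the Flink lib directory. The function name bundle_flink_version is made up for illustration, and it assumes the newer bundle naming scheme (for example hudi-flink1.15-bundle-0.12.0.jar). Older names such as hudi-flink-bundle_2.11-0.10.0.jar do not encode a Flink version, so the function prints nothing for them.

```shell
# Hypothetical helper: print the Flink version encoded in a Hudi bundle file name.
# Assumes names of the form hudi-flink<major.minor>-bundle...; prints nothing
# when the name carries no Flink version (e.g. hudi-flink-bundle_2.11-0.10.0.jar).
bundle_flink_version() {
  printf '%s\n' "$1" |
    sed -n 's/^hudi-flink\([0-9][0-9]*\.[0-9][0-9]*\)-bundle.*$/\1/p'
}

# Possible use on the cluster (assumes Flink is installed under /usr/lib/flink
# and the flink launcher is on PATH):
#   flink --version
#   for jar in /usr/lib/flink/lib/hudi-flink*; do
#     bundle_flink_version "$(basename "$jar")"
#   done
```

If the version printed for the jar does not match what flink --version reports, or nothing is printed at all, the bundle is a likely suspect for the insert failure.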