I'm trying to write this simple DataFrame
val df = Seq(
  List("aa","bb","cc"),
  List("aa","bb","cc"),
  List("aa","bb","cc")
).toDF("str")
to the following ClickHouse table:
CREATE TABLE IF NOT EXISTS testtable (
    str Array(String)
) ENGINE = Memory
This is the write code:
df.write
  .format("jdbc")
  .mode("append")
  .option("driver", "com.clickhouse.jdbc.ClickHouseDriver")
  .option("url", "jdbcurl")
  .option("user", "user")
  .option("password", "password")
  .option("dbtable", "testtable")
  .save()
During the write I got this error:
java.lang.IllegalArgumentException: Can't get JDBC type for array<string>
I tried converting the array to a string with df.withColumn("str", concat_ws(",", $"str")), but then I got a different error:
java.sql.SQLException: Unsupported type ARRAY
This time the data was actually written to the database, but not in the format I expected: ["aa,bb,cc"] instead of ["aa", "bb", "cc"] (three separate values).
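That result is expected from concat_ws: it collapses the whole array into one comma-joined string per row before the write, so ClickHouse receives a single string value, not three array elements. A minimal plain-Scala sketch (no Spark needed) of what concat_ws does to each row:

```scala
// concat_ws(",", $"str") joins the array elements into ONE string per row,
// which is why the table ends up with ["aa,bb,cc"] as a single-element array.
object ConcatWsSketch {
  def main(args: Array[String]): Unit = {
    val row    = List("aa", "bb", "cc")   // one DataFrame row's array value
    val joined = row.mkString(",")        // same effect as concat_ws(",", ...)
    println(joined)                       // "aa,bb,cc" -- a single String
  }
}
```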
Also, the official clickhouse-java docs mention that if you want to use arrays, you have to prepare the statement with the setArray
function, but I don't understand how exactly I can do this in Spark. Can anybody help me with that?
I solved this issue by switching from the clickhouse-jdbc
driver to the clickhouse-native-jdbc
driver by Housepower: https://housepower.github.io/ClickHouse-Native-JDBC/
You also need to register the ClickHouseDialect; all the details are in the link above (the Spark Integration section).
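To illustrate the registration step, here is a hedged sketch. The JdbcDialects.registerDialect call and the JdbcDialect base class are standard Spark APIs; the dialect body below is a stripped-down stand-in for the full ClickHouseDialect shown in the Housepower docs, and the driver class name and native port are assumptions to verify against your version:

```scala
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

// Stripped-down stand-in for the ClickHouseDialect from the docs linked above;
// the real dialect also maps Spark types (including ArrayType) to ClickHouse types.
object MyClickHouseDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean =
    url.startsWith("jdbc:clickhouse")
}

// Register the dialect once, before any reads/writes against ClickHouse.
JdbcDialects.registerDialect(MyClickHouseDialect)

df.write
  .format("jdbc")
  .mode("append")
  .option("driver", "com.github.housepower.jdbc.ClickHouseDriver") // native driver class (check your version)
  .option("url", "jdbc:clickhouse://host:9000") // native protocol port, an assumption
  .option("user", "user")
  .option("password", "password")
  .option("dbtable", "testtable")
  .save()
```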