I want to load the result of a subquery on an Impala table as a single Dataset.
My code looks like this:
String subQuery = "(select to_timestamp(unix_timestamp(now())) as ts from my_table) t";
Dataset<Row> ds = spark.read().jdbc(myImpalaUrl, subQuery, prop);
But it fails with this error:
Caused by: java.sql.SQLDataException: [Cloudera][JDBC](10140) Error converting value to Timestamp.
The unix_timestamp function on its own works, but to_timestamp fails. Why?
While debugging, I found that the statement built in org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD.compute() causes the problem:
sqlText = s"SELECT $columnList FROM ${options.table} $myWhereClause"
$columnList wraps each column name in double quotes, like "col_name". When I remove the double quotes, the query works fine.
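To make the effect of the quoting visible, here is a minimal plain-Java sketch. It is not Spark's actual code: the class QuotingSketch and the helper buildSelect are hypothetical names that only mimic the shape of the statement above. As far as I know, Impala treats double-quoted text as a string literal and uses backticks for identifiers, so the quoted form would most likely select the constant string "ts" instead of the ts column, which would explain the Timestamp conversion error.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

public class QuotingSketch {
    // Hypothetical helper (not part of Spark): builds "SELECT <columns> FROM <table>"
    // and applies the given quoting function to every column name.
    static String buildSelect(List<String> columns, String table, UnaryOperator<String> quote) {
        String columnList = columns.stream().map(quote).collect(Collectors.joining(","));
        return "SELECT " + columnList + " FROM " + table;
    }

    public static void main(String[] args) {
        String table = "(select to_timestamp(unix_timestamp(now())) as ts from my_table) t";
        List<String> columns = Arrays.asList("ts");

        // Double-quoted column name, as produced by the default quoting
        System.out.println(buildSelect(columns, table, c -> "\"" + c + "\""));

        // Unquoted column name, as produced when quoting is disabled
        System.out.println(buildSelect(columns, table, c -> c));
    }
}
```

The first printed statement starts with SELECT "ts" FROM and the second with SELECT ts FROM; only the second one refers to the column by name.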
I solved this problem by registering a custom dialect. The default dialect wraps column names in double quotes, so this one returns them unquoted:
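import org.apache.spark.sql.jdbc.JdbcDialect;
import org.apache.spark.sql.jdbc.JdbcDialects;

JdbcDialect ImpalaDialect = new JdbcDialect() {
    // Apply this dialect to Impala JDBC URLs only
    @Override
    public boolean canHandle(String url) {
        return url.startsWith("jdbc:impala") || url.contains("impala");
    }

    // Return the column name as-is instead of wrapping it in double quotes
    @Override
    public String quoteIdentifier(String colName) {
        return colName;
    }
};
// Register the dialect before calling spark.read().jdbc(...)
JdbcDialects.registerDialect(ImpalaDialect);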