I am trying to cast an array to Decimal(30,0) so I can use it dynamically in a SELECT:
WHERE array_contains(myArrayUDF(), someTable.someColumn)
However, when casting with:
val arrIds = someData.select("id")
  .withColumn("id", col("id").cast(DecimalType(30, 0)))
  .collect()
  .map(_.getDecimal(0))
Databricks accepts that, but the resulting type already looks wrong:
arrIds: Array[java.math.BigDecimal] = Array(2181890000000, ...) // i.e. a BigDecimal
This results in the following error:
Error in SQL statement: AnalysisException: cannot resolve.. due to data type mismatch: Input to function array_contains should have been array followed by a value with same element type, but it's [array<decimal(38,18)>, decimal(30,0)]
How do you correctly cast to decimal(30,0) in a Spark Databricks Scala notebook, instead of getting decimal(38,18)?
Any help appreciated!
You can make arrIds an Array[Decimal] using the code below:
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{Decimal, DecimalType}
val arrIds = someData.select("id")
.withColumn("id", col("id").cast(DecimalType(30, 0)))
.collect()
.map(row => Decimal(row.getDecimal(0), 30, 0)) // rebuild each value with precision 30, scale 0
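As a quick sanity check on the arrIds just built, each element should now report the requested precision and scale:

// Every element was rebuilt as Decimal(value, 30, 0)
arrIds.headOption.foreach(d => println(s"precision=${d.precision}, scale=${d.scale}")) // precision=30, scale=0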
However, that alone will not solve your problem, because the precision and scale are lost as soon as the values go through your user-defined function: Spark maps a UDF returning BigDecimal to the system default decimal(38, 18), as I explain in this answer.
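To see why, here is a minimal sketch. The definition of myArrayUDF is not shown in the question, so the body below is an assumption; what matters is that Spark infers the system-default decimal(38, 18) for any Scala UDF that returns BigDecimal values:

import org.apache.spark.sql.functions.udf

// Hypothetical reconstruction of myArrayUDF: even though every element of
// arrIds has precision 30 and scale 0, Spark types the return value as
// array<decimal(38,18)>, which is exactly what the error message reports.
val myArrayUDF = udf(() => arrIds.map(_.toJavaBigDecimal))
spark.udf.register("myArrayUDF", myArrayUDF)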
To solve your problem, you need to cast the column someTable.someColumn to Decimal with the same precision and scale as the UDF's return type. So your WHERE clause should be:
WHERE array_contains(myArrayUDF(), cast(someTable.someColumn as Decimal(38, 18)))
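For example, assuming someTable is registered in the catalog, the full query would look like:

// Cast the column to match the decimal(38, 18) elements the UDF returns
spark.sql("""
  SELECT *
  FROM someTable
  WHERE array_contains(myArrayUDF(), cast(someTable.someColumn as Decimal(38, 18)))
""").show()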