scala · apache-spark · avro · spark-avro

How to read decimal logical type into spark dataframe


I have an Avro file containing a decimal logicalType, declared as follows:

"type":["null",{"type":"bytes","logicalType":"decimal","precision":19,"scale":2}]


When I read the file with the Spark Avro library, the DataFrame schema is

MyField: binary (nullable = true)


How can I convert it into a decimal type?


Solution

  • You can specify a schema in the read operation:

    import org.apache.spark.sql.types.{DecimalType, StructField, StructType}

    val schema = new StructType()
        .add(StructField("MyField", DecimalType(19, 2)))
    

    or you can cast the column and convert it with a UDF:

    import org.apache.spark.sql.functions.{col, udf}

    val binToInt: String => Integer = Integer.parseInt(_, 2)
    val binToIntegerUdf = udf(binToInt)

    df.withColumn("MyField", binToIntegerUdf(col("MyField").cast("string")))
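Since the Avro specification encodes a decimal logical type as the two's-complement, big-endian bytes of the unscaled integer (with the scale taken from the schema, 2 in this case), another option is to decode those bytes directly. A minimal sketch, assuming the byte array from the binary column; the helper name is illustrative:

```scala
import java.math.{BigDecimal => JBigDecimal, BigInteger}

// Avro stores a decimal as the two's-complement big-endian bytes
// of the unscaled integer; the scale (2 here) comes from the schema.
def avroBytesToDecimal(bytes: Array[Byte], scale: Int): JBigDecimal =
  new JBigDecimal(new BigInteger(bytes), scale)
```

This can then be wrapped in a Spark UDF (e.g. `udf((b: Array[Byte]) => avroBytesToDecimal(b, 2))`) and applied to the binary column with `withColumn`, followed by a cast to `decimal(19,2)` if an exact Spark decimal type is needed.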