Tags: python, pyspark

Bytes values in pySpark Dataframe


I have a PySpark dataframe with a column containing bytes in a nested dictionary, so the data looks like this: Col_name: `{"bytes":"\u0014ok\u0000"}` and so on. The logical type of this field is DECIMAL, so it should return a decimal value, but first I need to cast it to binary. When I cast it with the following code, the extracted value is incorrect. Can anyone help with this? Thanks.

df = df.withColumn("col_name", col("col_name").cast("binary"))

Solution

  • Here is my solution:

    from pyspark.sql.functions import udf, col, element_at, from_json
    from pyspark.sql.types import IntegerType
    
    jsonString = """{"bytes":"\\u0014\\u0000"}"""
    df = spark.createDataFrame(data=[(1, jsonString)], schema=["id", "value"])
    
    df.show(truncate=False)
    
    def convertColumn(s):
        # convert the bytes string to an integer here
        return 10
    
    parseInt = udf(convertColumn, IntegerType())
    
    # Parse the JSON string into a map and extract the "bytes" entry,
    # then hand it to the UDF for conversion.
    res = df.withColumn(
        "parsed",
        parseInt(element_at(from_json(col("value"), "MAP<STRING,STRING>"), "bytes")),
    )
    
    res.show(truncate=False)
    

    Output:

    +---+------------------------+
    |id |value                   |
    +---+------------------------+
    |1  |{"bytes":"\u0014\u0000"}|
    +---+------------------------+
    
    +---+------------------------+------+
    |id |value                   |parsed|
    +---+------------------------+------+
    |1  |{"bytes":"\u0014\u0000"}|10    |
    +---+------------------------+------+
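    The `convertColumn` stub above just returns a constant. A sketch of a real conversion, assuming the field is an Avro-style DECIMAL (big-endian two's-complement unscaled integer, here with an illustrative scale of 2 — the actual scale comes from your schema), could look like this in plain Python:

    ```python
    from decimal import Decimal

    def decode_decimal(s, scale=2):
        # from_json returns the bytes as a string; latin-1 maps code
        # points 0-255 one-to-one back to the raw byte values.
        raw = s.encode("latin-1")
        # Avro decimals store the unscaled value as a big-endian
        # two's-complement integer.
        unscaled = int.from_bytes(raw, byteorder="big", signed=True)
        # Apply the (assumed) scale from the schema.
        return Decimal(unscaled).scaleb(-scale)

    print(decode_decimal("\u0014\u0000"))  # 0x1400 = 5120 unscaled → 51.20
    ```

    You could drop a function like this into the `udf` in place of `convertColumn` (returning a `DecimalType` instead of `IntegerType`); the scale parameter is an assumption and should match your field's logical type.
    
    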