java hive udf

Hive UDF: struct type loses type information. Is there any way to recover it?


My table has mostly double columns and a few string columns. I created it from a text file using row format serde 'org.openx.data.jsonserde.JsonSerDe'. I first combine the columns with the named_struct function and pass the result to my UDF, something like this:

select id, my_udf(named_struct("key1", col1, "key2", col2, "key3",col3, "key4", col4), other_udf_param1, other_udf_param2);

Here col1, col2 and col3 are of type double and col4 is of type string.

But inside the UDF all of them appear to have been converted to String.

This is a snippet from my evaluate function.

List<? extends StructField> fields = this.dataOI.getAllStructFieldRefs();

for (int i = 0; i < fields.size(); i++) {
    System.out.println(fields.get(i).toString());
    String canName = this.featuresOI.getStructFieldData(arguments[2].get(), fields.get(i)).getClass().getCanonicalName();
    System.out.println(canName + " can name");
    System.out.println(this.dataOI.getStructFieldData(arguments[2].get(), fields.get(i)));
}

This returns all of them as strings.

Is there a way I could preserve the column types?


Solution

  • Yes, the column types are preserved in each field's ObjectInspector (StructField.getFieldObjectInspector()). The same behaviour can be observed on the Hive CLI for named_struct; for map, however, the inputs are all converted to strings.
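
A minimal sketch of reading the declared type from each field's ObjectInspector instead of calling getClass() on the raw field data. It assumes the Hive serde classes are on the classpath. The class name StructFieldTypes and the helpers describe and javaValue are hypothetical names made up for illustration, not part of Hive or of the original UDF.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.StructField;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;

// Hypothetical helper class, for illustration only.
public class StructFieldTypes {

    // Builds "fieldName:hiveType" for every field of the struct.
    // The type comes from the field's own ObjectInspector, not from
    // the runtime class of the data object.
    static List<String> describe(StructObjectInspector structOI) {
        List<String> out = new ArrayList<>();
        for (StructField field : structOI.getAllStructFieldRefs()) {
            ObjectInspector fieldOI = field.getFieldObjectInspector();
            out.add(field.getFieldName() + ":" + fieldOI.getTypeName());
        }
        return out;
    }

    // Extracts one field and, if it is a primitive, converts it to the
    // plain Java object (Double, String, ...) via its ObjectInspector.
    static Object javaValue(StructObjectInspector structOI,
                            Object struct,
                            StructField field) {
        Object raw = structOI.getStructFieldData(struct, field);
        ObjectInspector fieldOI = field.getFieldObjectInspector();
        if (fieldOI.getCategory() == ObjectInspector.Category.PRIMITIVE) {
            return ((PrimitiveObjectInspector) fieldOI).getPrimitiveJavaObject(raw);
        }
        // Non-primitive fields (list, map, nested struct) are returned as-is.
        return raw;
    }
}
```

In a GenericUDF, the StructObjectInspector would typically be captured in initialize() from the ObjectInspector at the struct's argument position, and the struct object in evaluate() would come from calling get() on the argument at that same position. Which position that is depends on how the UDF is called.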