How to alter an ORC file's embedded schema?

Is there a lightweight way to change the data type of a specific column in an ORC file without converting the whole column and rewriting the entire ORC file?

The following is a heavy-weight solution:

1. Read the ORC file in Spark
2. Convert the data type of the specific column
3. Write the converted ORC file to HDFS
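For reference, the three steps above could look roughly like this in PySpark. This is only a sketch: the function name `rewrite_orc_with_cast`, the paths, and the column name are hypothetical, and it assumes an already running SparkSession is passed in.

```python
def rewrite_orc_with_cast(spark, src_path, dst_path, column, new_type):
    """Read an ORC dataset, cast one column, and write a new ORC dataset.

    `spark` is assumed to be an existing pyspark.sql.SparkSession.
    `dst_path` should differ from `src_path`, because overwriting a
    path while it is still being read from is not safe.
    """
    # Step 1: read the ORC file(s); the schema comes from the ORC metadata.
    df = spark.read.orc(src_path)

    # Step 2: replace the column with a cast copy of itself.
    # Values that cannot be parsed as the new type become null.
    converted = df.withColumn(column, df[column].cast(new_type))

    # Step 3: write the result back out as ORC.
    converted.write.mode("overwrite").orc(dst_path)


# Hypothetical usage, with made-up paths and column name:
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("orc-cast").getOrCreate()
#   rewrite_orc_with_cast(spark, "hdfs:///data/in", "hdfs:///data/out",
#                         "price", "double")
```

Writing to a separate output path and swapping it in afterwards keeps the original files intact if the job fails halfway.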

I'm looking for a lightweight solution where I can just alter the embedded metadata.

Thanks!

Solution

It's not the answer you're looking for, but no, you can't change a column type in ORC without regenerating the file. What you're suggesting is the correct way to do it.

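ORC stores indexes and per-column statistics (such as min, max and sum) in the file's metadata, mainly in the file footer and in the per-stripe indexes, and it encodes the column values differently for each type. Changing string -> double would therefore require the entire column to be scanned and re-encoded so that the statistics could be recalculated for what is now a numerical column.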
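To see why the stored statistics cannot simply be relabeled, here is a small plain-Python illustration (not ORC code; the sample values are made up). The same three values give different min/max results depending on whether they are compared as strings or as doubles:

```python
# Made-up sample values, as they might appear in a string column.
values = ["10", "9", "2.5"]

# Statistics for a string column use lexicographic ordering.
string_stats = {"min": min(values), "max": max(values)}

# After a cast to double, ordering is numeric and a sum becomes meaningful.
as_double = [float(v) for v in values]
double_stats = {
    "min": min(as_double),
    "max": max(as_double),
    "sum": sum(as_double),
}

print(string_stats)  # {'min': '10', 'max': '9'}
print(double_stats)  # {'min': 2.5, 'max': 10.0, 'sum': 21.5}
```

A reader that trusted the old string min/max for a double column would skip or include the wrong row ranges, which is why the statistics have to be recomputed from the data.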