parquet parquet-mr

How to use the Parquet UUID Logical Type in a schema


Somewhat recently, the parquet-format project added a UUID logical type, specifically in revision 2.4 of the Parquet format. I'd like to use the parquet-mr library in Java to create some Parquet files, but I can't figure out how to use the UUID logical type in a Parquet schema. A simple schema like this does not parse:

message SimpleSchema {
  required int32 value1;
  required fixed_len_byte_array(16) value2 ( UUID );
}

I've tried many variations on this schema, and so far none of them parse with the MessageTypeParser.parseMessageType method. Is this a bug or limitation in the parquet-mr library, or am I just formatting my schema incorrectly? Thanks!


Solution

  • The parquet-mr library currently doesn't support the UUID logical type. There is an open issue tracking progress on implementing this feature.
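
Until the annotation is supported, one possible workaround (a suggestion, not something the answer above proposes) is to drop the ( UUID ) annotation, declare the column as plain `required fixed_len_byte_array(16) value2;`, and convert each UUID to 16 bytes yourself. The parquet-format specification describes a UUID as a 16-byte value in big-endian order, so the sketch below uses that layout. The class and method names (`UuidBytes`, `toBytes`, `fromBytes`) are hypothetical helpers and are not part of parquet-mr; only the Java standard library is used.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper (not part of parquet-mr): converts between java.util.UUID
// and the 16-byte big-endian layout used for a fixed_len_byte_array(16) column.
final class UuidBytes {

    private UuidBytes() {}

    // Encode a UUID as 16 bytes, most significant bits first.
    // ByteBuffer is big-endian by default, so no explicit byte order is set.
    public static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    // Decode 16 big-endian bytes back into a UUID.
    public static UUID fromBytes(byte[] bytes) {
        if (bytes == null || bytes.length != 16) {
            throw new IllegalArgumentException("expected exactly 16 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long mostSignificant = buf.getLong();
        long leastSignificant = buf.getLong();
        return new UUID(mostSignificant, leastSignificant);
    }
}
```

The resulting byte array can then be written to the unannotated fixed-length column. Readers of the file will see raw bytes rather than a UUID-typed column, so they need the same conversion on the way back.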