I use apache parquet to create Parquet tables with process information of a machine and I need to store file wide metadata (Machine ID and Machine Name).
It is stated that parquet files are capable of storing file wide metadata, however i couldn't find anything in the documentation about it.
There is another stackoverflow post that tells how it is done with pyarrow. As far as the post is telling, i need some kind of key value pair (maybe map<string, string>) and add it to the schema somehow.
I Found a class inside the parquet source code that is called parquet::FileMetaData that may be used for this purpose, however there is nothing in the docs about it.
Is it possible to store file-wide metadata with c++ ?
Currently i am using the stream_reader_writer example for writing parquet files
You can pass the file level metadata when calling parquet::ParquetFileWriter::Open
, see the source code here