When I load Parquet files into a BigQuery table, the stored values look garbled. It seems to be related to the encoding of BYTES fields, or something along those lines.
Here's the format of the created fields:
So when I read the table and cast those fields, I get the readable values back.
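For reference, a read like the following gives me the readable values (table and column names here are just placeholders, not my real schema):

```python
from google.cloud import bigquery

client = bigquery.Client()

# Casting the BYTES column to STRING makes the values human-readable again
# (this works when the bytes are valid UTF-8).
query = """
    SELECT CAST(my_bytes_col AS STRING) AS my_str_col
    FROM `my-project.my_dataset.my_table`
"""
for row in client.query(query).result():
    print(row.my_str_col)
```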
I found the solution here.
My question is: why is BigQuery behaving like this?
According to this GCP documentation, some Parquet data types can be converted into more than one BigQuery data type. The workaround is to declare, in the Parquet schema, the specific type you want BigQuery to parse the column as.
For example, to convert the Parquet INT32 data type to the BigQuery DATE data type, specify the following:
optional int32 date_col (DATE);
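If you generate the Parquet files yourself, for example with pyarrow, these annotations are driven by the Arrow column types. A minimal sketch (field names are illustrative, not from the question):

```python
import datetime

import pyarrow as pa
import pyarrow.parquet as pq

# The Arrow type determines the Parquet physical/logical type, which in turn
# drives BigQuery's type mapping:
#   pa.string() -> BYTE_ARRAY (STRING annotation) -> BigQuery STRING
#   pa.binary() -> BYTE_ARRAY (no annotation)     -> BigQuery BYTES
#   pa.date32() -> INT32 (DATE annotation)        -> BigQuery DATE
schema = pa.schema([
    pa.field("text_col", pa.string()),
    pa.field("raw_col", pa.binary()),
    pa.field("date_col", pa.date32()),
])

table = pa.table(
    {
        "text_col": ["hello"],
        "raw_col": [b"\x00\x01"],
        "date_col": [datetime.date(2021, 1, 1)],
    },
    schema=schema,
)
pq.write_table(table, "example.parquet")
```

So a column that should end up as STRING in BigQuery must carry the STRING (UTF8) annotation in the Parquet file; a plain BYTE_ARRAY without it is mapped to BYTES, which is the behavior described in the question.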
Another option is to pass an explicit schema to the bq load command:
bq load --source_format=PARQUET --noreplace --noautodetect \
  --parquet_enum_as_string=true --decimal_target_types=STRING \
  [project]:[dataset].[table] gs://[bucket]/[file].parquet Column_name:Data_type
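The same load can also be written with the Python client library; this is a sketch assuming the google-cloud-bigquery package, with placeholder project, dataset, bucket, file, and column names:

```python
from google.cloud import bigquery

client = bigquery.Client()

# --parquet_enum_as_string=true
parquet_options = bigquery.ParquetOptions()
parquet_options.enum_as_string = True

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.PARQUET,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,  # --noreplace
    autodetect=False,                                          # --noautodetect
    decimal_target_types=["STRING"],                           # --decimal_target_types=STRING
    schema=[bigquery.SchemaField("date_col", "DATE")],         # Column_name:Data_type
    parquet_options=parquet_options,
)

load_job = client.load_table_from_uri(
    "gs://my-bucket/my-file.parquet",
    "my-project.my_dataset.my_table",
    job_config=job_config,
)
load_job.result()  # wait for the load job to finish
```

Either way, the point is the same: when a Parquet type can map to several BigQuery types, you have to tell BigQuery which one you want instead of relying on autodetection.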