Is there a product that meets these requirements:
Basically, I would like Google BigQuery, but without the requirement for a consistent pre-specified schema.
Apache Drill offers SQL-like queries across large JSON datasets with different schemas. However, I don't think it works well on cloud-hosted datasets - have set it up to talk to an S3 bucket, and it seems to need to download all the data locally, which makes it unusable with large datasets.
BigQuery requires a static schema, but one option is to define the tables as CSV with a separator that doesn't appear in the data, meaning you'd have a single string column. You'll be limited to the JSON_EXTRACT
functions for pulling data from the JSON, but from there you can cast to other data types and so on.
You could also consider filing a feature request for the BigQuery team to consider supporting a document type to make working with this kind of data easier in the future.