Search code examples
jsongoogle-bigqueryjsonpath

Find all values with some key located in separate property using JSONPath


My customer stores data in GCP BigQuery service as table with column named people with type string containing JSON data

One part of data look like:

{
  "id": "someid",
  "data": [
    {
      "key": "country",
      "valueString": "USA"
    },
    {
      "key": "name",
      "valueString": "Peter"
    }
  ]
}

The other one:

{
  "id": "someid",
  "data": [
    {
      "key": "age",
      "valueString": "23"
    },
    {
      "key": "country",
      "valueString": "France"
    },
    {
      "key": "name",
      "valueString": "Peter"
    }
  ]
}

I want to create a query that can find all values where key="country":

country|
-------|
USA    |
-------|
France |
-------|
USA    |
-------|
.......|

It seems to me that I need to use the JSONPath query language. My initial version is:

SELECT JSON_EXTRACT_SCALAR(people, "$.data[0].valueString") AS country
FROM table_name

But it doesn't solve the problem, because this field can store in 0, 1,... n elements.


Solution

  • Consider below [super]simple approach

    select json_extract_scalar(json, '$.valueString') as country
    from your_table, unnest(json_extract_array(people, '$.data')) json
    where json_extract_scalar(json, '$.key') = 'country'    
    

    if applied to sample data in your question - output is

    enter image description here