Search code examples
jsongoogle-bigquerynested

How to insert JSON data from a BigQuery Json table into a nested table?


I have a BigQuery table called json_data with a single column json_column of type STRING. The JSON data stored in this column has the following structure:

{
  "id": 1,
  "name": "John Doe",
  "email": "john@example.com",
  "addresses": [
    {
      "street": "123 Main St",
      "city": "New York",
      "country": "USA"
    },
    {
      "street": "456 Elm St",
      "city": "Los Angeles",
      "country": "USA"
    }
  ]
}

I want to insert this JSON data into a nested table called users in BigQuery, which has the following schema:

CREATE TABLE users (
  id INT64,
  name STRING,
  email STRING,
  addresses ARRAY<STRUCT<
    street STRING,
    city STRING,
    country STRING
  >>
)

Can someone please guide me on how to write a SQL query to achieve this? I would like to extract the JSON data from the json_data table and insert it into the users table with the appropriate structure.

Attempt: I tried the below query but it did not give the expected response:

INSERT INTO users (id, name, email, addresses)
SELECT 
  JSON_EXTRACT_SCALAR(json_column, '$.id') AS id,
  JSON_EXTRACT_SCALAR(json_column, '$.name') AS name,
  JSON_EXTRACT_SCALAR(json_column, '$.email') AS email,
  JSON_EXTRACT_ARRAY(json_column, '$.addresses') AS addresses
FROM json_data

But I get an error:

Query column 4 has type ARRAY<STRING> which cannot be inserted into column addresses, which has type ARRAY<STRUCT<street STRING, city STRING, country STRING>> at [2:1]

Any help would be greatly appreciated. Thank you!


Solution

  • I think you should try something like below:

    -- Insert the data from the source_table into the destination_table 
    INSERT INTO destination_table (id, name, email, addresses) 
    SELECT 
     JSON_EXTRACT_SCALAR(json_data, '$.id') AS id, 
     JSON_EXTRACT_SCALAR(json_data, '$.name') AS name, 
     JSON_EXTRACT_SCALAR(json_data, '$.email') AS email, 
     ARRAY( 
       SELECT AS STRUCT 
         JSON_EXTRACT_SCALAR(address, '$.street') AS street, 
         JSON_EXTRACT_SCALAR(address, '$.city') AS city, 
         JSON_EXTRACT_SCALAR(address, '$.country') AS country 
       FROM UNNEST(JSON_EXTRACT_ARRAY(json_data, '$.addresses')) AS address 
     ) AS addresses 
    FROM source_table; 
    

    Most of the things are obvious but the array of structs extraction is explained below :

    • For the addresses column, an array of structs is constructed using the ARRAY and SELECT AS STRUCT syntax.
    • Within the array construction, JSON_EXTRACT_SCALAR is used again to extract values from each address object within the JSON array.
    • The FROM UNNEST clause is used to unnest the JSON array, treating each address object as a separate row.