Search code examples
sqljsonamazon-athenaprestotrino

Athena Nested Json Extract


I Have this nested json in Athena

 { "template": {"name":"User Name", "id": 10001}} 

Output :- {"name":"User Name", "id": 10001}

I'm trying to extract id from the inner json so far i have tried json extract and json extract scalar but both are not working for nested JSON

with this we can access the template json

JSON_EXTRACT(db.column_name, '$.template') AS template

i'm trying nested json to access the id field but it is inaccessible

json_extract_scalar(json_extract(db.column_name, '$.template'), '$.id') as id
json_extract(json_extract(db.column_name, '$.template'), '$.id') as id

Solution

  • TBH both of your attempts work for me but in general you should be able to access it by providing full path (no need for nested extracts):

    -- sample data
    with dataset(json_col) as(
        values ('{ "template": {"name":"User Name", "id": 10001}}' )
    )
    
    -- query
    select json_extract_scalar(json_col, '$.template.id') id
    from dataset;
    

    Output:

      id
    -------
     10001
    (1 row)
    

    Possibly you have some erroneous rows so you can use try which will return null in case of error:

    -- query
    select try(json_extract_scalar(json_col, '$.template.id')) id
    from dataset;
    

    UPD

    Your actual problem is that template contains a string, not a nested JSON object (and content of the string is actually encoded JSON). You need to parse JSON again - try using json_extract_scalar in the inner part (json_extract_scalar will return varchar string which will be parsed again by the outer one):

    -- sample data
    with dataset(json_col) as(
        values ('{ "template": "{\"project_id\": 1736024, \"sender_name\": \"something\"}"}' )
    )
    
    -- query
    select json_extract_scalar(json_extract_scalar(json_col, '$.template'), '$.project_id') id
    from dataset;
    
      id
    -------
     1736024
    (1 row)