Search code examples
sqlamazon-athenapresto

How to fetch info from curly braces using Athena


I am trying to fetch info using Athena, I don't know much about the Athena service of AWS. Could anyone help me how to fetch this particular info i.e. "Attribute4": "RSQ". I am pasting a demo data in JSON format to give better understanding how the data is structured.

I will grateful if you could give explanation and any document that will help me to enrich my knowledge regarding Athena querying style.

{ "ID": "1234", "Name": "XYZ", "Age": "29", "address": { "street": "ABC", "houseno": "PRQ", "attributeDetails": { "Attribute1": "FGH", "Attribute2": "KLM", "Attribute3": "LMN", "Attribute4": "RSQ" }

Athena Query which I am trying to write

select ID, address.attributeDetails FROM "TableName"


Solution

  • Athena is using Presto sql engine which has several functions to work with json, in this case you can use json_extract_scalar to get the id and json_extract for attributeDetails:

    -- sample data
    WITH dataset(json_str) AS (
        VALUES ('{ "ID": "1234", "Name": "XYZ", "Age": "29", "address": { "street": "ABC", "houseno": "PRQ", "attributeDetails": { "Attribute1": "FGH", "Attribute2": "KLM", "Attribute3": "LMN", "Attribute4": "RSQ" }')
    )
    
    -- query
    SELECT json_extract_scalar(json_str, '$.ID'), json_extract(json_str, '$.address.attributeDetails')
    FROM dataset
    

    Output:

    _col0 _col1
    1234 {"Attribute1":"FGH","Attribute2":"KLM","Attribute3":"LMN","Attribute4":"RSQ"}