Search code examples
sqlamazon-web-servicesamazon-athenaprestotrino

Convert String column to JSON in AWS Athena


I am currently having table with one column name event (string) in athena external table and i just want to get that value as a JSON. I am using below query but it converts it into string:

select CAST(event AS JSON) AS json_event from table); 

output is appending \ with every double quote "

value in table:

{"att1": "a", "att2" : "a2"}

above query after CAST() return:

"{\"att1\": \"a\", \"att2\" : \"a2\"}

Moreover, when i run query:

select json_event.attribute
from
(select CAST(event AS JSON) AS json_event from table);

it throws exception:

TYPE_MISMATCH: line 1:8: Expression json_event is not of type ROW

Am i doing something wrong? Thanks in advance for the help.


Solution

  • Use json_parse to get JSON value deserialized from the input JSON text:

    select json_parse('{"att1": "a", "att2" : "a2"}');
    

    Output:

              _col0
    --------------------------
     {"att1":"a","att2":"a2"}
    

    UPD

    The event.userid syntax can be used to access fields of ROW datatype, not JSON, if you want to extract some information from JSON string you need to use different approaches (depending on what actually is needed). For example in this particular case you don't need even to parse the string as JSON just use json_extract_scalar (can be used both on varchar's and JSON types):

    select json_extract_scalar('{"att1": "a", "att2" : "a2", "id": 42}', '$.id');
    

    Output:

     _col0
    -------
     42