Search code examples
amazon-athenaprestotrino

array to string with prestodb


I have an AWS Athena (parquet) database that in the Athena console returns:

enter image description here

.. from the following query:

SELECT id, 
TYPEOF(organisations) as type,
cardinality(organisations) as len,
organisations
FROM mydb.tbl;

Reading the full table remotely into Python (with awswrangler) is failing because organisations is array (sometimes zero length). I've been trying to convert the organisaions field to a string representation of the array (or equivalent json) but nothing I've tried so far works. Any sugestions much appreciated.


Solution

  • If you don't care about preserving the organization and charoffset field names, you can cast the array to JSON and then format it as varchar with json_format:

    WITH t(organizations) AS (
       VALUES
            ARRAY[ROW('x', 1), ROW('y', 2)],
            ARRAY[ROW('a', 3), ROW('b', 4)],
            ARRAY[]
    )
    SELECT json_format(CAST(organizations AS JSON))
    FROM t
    

    =>

           _col0
    -------------------
     [["x",1],["y",2]]
     [["a",3],["b",4]]
     []