Search code examples
jsonamazon-web-servicesamazon-redshiftamazon-redshift-spectrum

AWS Redshift spectrum ERROR: syntax error at or near "json"


I want to create an external table using redshift spectrum, files are stored in s3 as json file having a single object with values as an array of objects.

my query for the external table

CREATE EXTERNAL TABLE jatinspectrum.extable (
enteries array<struct<title:varchar(4000),link:varchar(4000),author:varchar(4000),published_date:timestamp,category:array<varchar(4000)>>>
)
stored as json
LOCATION 's3://xxxxxxxxxxxxxx/xxxxxxxxxxxxxx/xxxxxxxxxxx/';

it throws an error ERROR: syntax error at or near "json" Position: 198

just to make sure that i am syntactically right i saved it as text file using another query

CREATE EXTERNAL TABLE jatinspectrum.extab (
enteries array<struct<title:varchar(4000),link:varchar(4000),author:varchar(4000),published_date:timestamp,category:array<varchar(4000)>>>
)
stored as textfile
LOCATION 's3://xxxxxxxxxxxxxx/xxxxxxxxxxxxxx/xxxxxxxxxxx/';

this worked perfectly but this won't help me since nested data is not supported for text query.


Solution

  • This worked for me, need to changes this stored as textfile to this

    ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
    stored as textfile
    

    https://docs.aws.amazon.com/athena/latest/ug/parsing-JSON.html