Search code examples
jsonamazon-web-servicesaws-glueamazon-athena

Unable to query JSON using Athena : HIVE_METASTORE_ERROR: Error: type expected at the position 0 of 'ARRAY <STRING>' but 'ARRAY' is found


I am getting the following error when I try to query JSON file using a glue table via Athena :

HIVE_METASTORE_ERROR: Error: type expected at the position 0 of 'ARRAY <STRING>' but 'ARRAY' is found. (Service: null; Status Code: 0; Error Code: null; Request ID: null; Proxy: null)
This query ran against the "" database, unless qualified by the query.

The Glue table schema is as follows :

[
  {
    "Name": "id",
    "Type": "string",
    "Comment": ""
  },
  {
    "Name": "eligibleterritorycodes",
    "Type": "ARRAY <STRING>",
    "Comment": ""
  },
  {
    "Name": "excludedterritorycodes",
    "Type": "ARRAY <STRING>",
    "Comment": ""
  },
  {
    "Name": "isdeleted",
    "Type": "boolean",
    "Comment": ""
  },
  {
    "Name": "requestsuccessful",
    "Type": "boolean",
    "Comment": ""
  }
]

And the bucket to which this Glue table points just has one file with a single line :

{"id":"-1000000102013416","eligibleTerritoryCodes":["00"],"excludedTerritoryCodes":["US"],"isDeleted":false,"requestSuccessful":true}

I have also tried setting ignore.malformed.json as "true" for the glue table but to no success. I am not sure why it is giving such error when the glue table structure is exactly as the data in the file. Any help would be appreciated.


Solution

  • I could not get past this error but the following Athena query helped me resolve the issue and unblock me:

    CREATE EXTERNAL TABLE odp_table (
      dmid string,
      eligibleterritorycodes array<string>,
      excludedterritorycodes array<string>,
      isdeleted boolean,
      requestsuccessful boolean
     )
     ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
     WITH SERDEPROPERTIES ('ignore.malformed.json' = 'true')
     LOCATION 's3://odp-update-beta/final_request_models/"json_table" ';
     
     select * from json_table;
    

    So instead of using Glue table, I have used created an external table in athena with the same schema. This helpes me query the JSON file.