Search code examples
amazon-web-servicesamazon-athena

Is it possible to add fields to struct in an existing AWS Athena table?


I have a table that tracks user actions on a high-throughput site that is defined as (irrelevant fields, etc removed):

CREATE EXTERNAL TABLE `actions`(
  `uuid` string COMMENT 'from deserializer', 
  `action` string COMMENT 'from deserializer', 
  `user` struct<id:int,username:string,country:string,created_at:string> COMMENT 'from deserializer')
PARTITIONED BY ( 
  `ingestdatetime` string)
ROW FORMAT SERDE 
  'org.openx.data.jsonserde.JsonSerDe' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://<path_to_bucket>'
TBLPROPERTIES (
  'transient_lastDdlTime'='1506104792')

And want to add some more fields to the user data (e.g. level:int to track what level the user was when they performed the action).

Is it possible to alter the table definition to include these new properties, and if so, is it possible to configure default values in the event that they aren't in the source data files?


Solution

  • No, You can't add a new column to struct in Athena. You can delete Schema and then create a new Table with required columns. Deleting schema or database won't affect your data because Athena doesn't store data itself, it just points to data in S3.