Search code examples
amazon-s3hiveanalysisamazon-athena

Athena returns blank response for Partitioned data, what am I missing?


I have created a table using partition. I tried two ways for my s3 bucket folder as following but both ways I get no records found when I query with where clause containing partition clause.

My S3 bucket looks like following. part*.csv is what I want to query in Athena. There are other folders at same location along side output, within output.

s3://bucket-rootname/ABC-CASE/report/f78dea49-2c3a-481b-a1eb-5169d2a97747/output/part-filename121231.csv s3://bucket-rootname/XYZ-CASE/report/678d1234-2c3a-481b-a1eb-5169d2a97747/output/part-filename213123.csv

my table looks like following Version 1:

CREATE EXTERNAL TABLE `mytable_trial1`(
  `status` string, 
 `ref` string)
  PARTITIONED BY ( 
  `casename` string, 
  `id` string)
ROW FORMAT DELIMITED 
  FIELDS TERMINATED BY ',' 
LOCATION
  's3://bucket-rootname/'
TBLPROPERTIES (
  'has_encrypted_data'='false', 
  'skip.header.line.count'='1') 

ALTER TABLE mytable_trial1 add partition (casename="ABC-CASE",id="f78dea49-2c3a-481b-a1eb-5169d2a97747") location "s3://bucket-rootname/casename=ABC-CASE/report/id=f78dea49-2c3a-481b-a1eb-5169d2a97747/output/";

select * from mytable_trial1 where casename='ABC-CASE' and report='report' and id='f78dea49-2c3a-481b-a1eb-5169d2a97747' and foldername='output';

Version 2:

 CREATE EXTERNAL TABLE `mytable_trial1`(
      `status` string, 
     `ref` string)
      PARTITIONED BY ( 
      `casename` string, 
      `report` string,
       `id` string,
      `foldername` string)
    ROW FORMAT DELIMITED 
      FIELDS TERMINATED BY ',' 
    LOCATION
      's3://bucket-rootname/'
    TBLPROPERTIES (
      'has_encrypted_data'='false', 
      'skip.header.line.count'='1') 

ALTER TABLE mytable_trial1 add partition (casename="ABC-CASE",report="report",id="f78dea49-2c3a-481b-a1eb-5169d2a97747",foldername="output") location "s3://bucket-rootname/casename=ABC-CASE/report=report/id=f78dea49-2c3a-481b-a1eb-5169d2a97747/foldername=output/";

select * from mytable_trial1 where casename='ABC-CASE' and id='f78dea49-2c3a-481b-a1eb-5169d2a97747'

Show partitions shows this partition but no records found with where clause.


Solution

  • I worked with the AWS Support and we were able to narrow down the issue. Version 2 was right one to use since it has four partitions like my S3 bucket. Also, the Alter table command had issue with location. I used hive format location which was incorrect since my actual S3 location is not hive format. So correcting the command to following worked for me.

    ALTER TABLE mytable_trial1 add partition (casename="ABC-CASE",report="report",id="f78dea49-2c3a-481b-a1eb-5169d2a97747",foldername="output") location "s3://bucket-rootname/ABC-CASE/report/f78dea49-2c3a-481b-a1eb-5169d2a97747/output/";
    

    Preview table now shows my entries.