Search code examples
amazon-web-servicesamazon-s3hivehiveqlamazon-athena

Exclude certain S3 folders while creating a view on Hive


I am trying to create a view to fetch data from a bucket by excluding certain folders inside S3 on Hive. I was able to successfully create view on Athena, but couldn't do the same on Hive.

Athena View:

CREATE VIEW test
as
SELECT *
FROM TABLE_A
WHERE NOT ("$PATH LIKE '%PASSENGER_DATA%')
AND NOT ("$PATH LIKE '%CUSTOMER_DATA%');

Could you please advise how the same could be achieved on Hive?


Solution

  • You may actually wish to consider moving the data into its own folders. Then you could build them as a table with:

    CREATE VIEW TABLE_DATA
    as
    SELECT *
    FROM TABLE_A --("$PATH LIKE '%PASSENGER_DATA%')
    UNION
    SELECT *
    FROM TABLE_B -- ("$PATH LIKE '%CUSTOMER_DATA%');
    

    THis likely will also make your permission issues easier to manage.

    And when needed you could easily use one table or both tables.