Search code examples
hadoophivehdfsexternal-tables

How to delete an external table in Hive when the hdfs path has been deleted?


I've removed my HDFS path /user/abc, and some Hive tables were stored in /user/abc/data/abc.db , with a rm -R command.

Despite having my regular tables correctly deleted with Hive SQL, my external tables didn't drop, with the following error:

[Code: 1, SQL State: 08S01]  Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Failed to load storage handler:  Error in loading storage handler.org.apache.phoenix.hive.PhoenixStorageHandler)

How can I safely delete the tables?

I tried using:

 delete from TBL_COL_PRIVS where TBL_ID=[myexternaltableID];
 delete from TBL_PRIVS where TBL_ID=[myexternaltableID];
 delete from TBLS where TBL_ID=[myexternaltableID];

But it didn't work with the following error message:

[Code: 10297, SQL State: 42000]  Error while compiling statement: FAILED: SemanticException [Error 10297]: Attempt to do update or delete on table sys.TBLS that is not transactional

Thank you,

NB: I know a schema is supposed to be deleted more safely with HiveQL but on this particular case, this was not done this way.


Solution

  • Here is a supplementary answer to @Eve's above, she has missed some delete queries about partitions. All delete queries of hive metastore below:

    --Get the tbl_id of the table waited to be dropped at first, the left steps are all based on this.
    select TBL_ID from TBLS where tbl_name = 'your_table_name';
    
    --Now we can start the delete operations with the id we got above.
    delete from TABLE_PARAMS where TBL_ID=tblid;
    delete from TBL_COL_PRIVS where TBL_ID=tblid;
    delete from TBL_PRIVS where TBL_ID=tblid;
    delete from PARTITION_KEY_VALS where PART_ID in (select PART_ID from PARTITIONS where TBL_ID=tblid);
    delete from PARTITION_PARAMS where part_id in (select PART_ID from PARTITIONS where TBL_ID=tblid);
    delete from PART_COL_STATS where PART_ID in (select PART_ID from PARTITIONS where TBL_ID=tblid);
    delete from PARTITIONS where TBL_ID=tblid;
    delete from PARTITION_KEYS where tbl_id = tblid;
    delete from TBLS where TBL_ID=tblid;
    

    The same notice: Order is important!