Search code examples
hadoophiveapache-pighiveql

Add a fixed value column to a table in Hive


I am prototyping my pig script to hive. I need to add a status column to the table which is imported from Oracle database.

My pig scripts looks like this:

user_data = LOAD 'USER_DATA' USING PigStorage(',') AS (USER_ID:int,MANAGER_ID:int,USER_NAME:int);

user_data_status = FOREACH user_data GENERATE
USER_ID,
MANAGER_ID,
USER_NAME,
'active' AS STATUS;

Here I am adding the STATUS column with 'active' value to the user_data table. How can I add column to an existing table to add column while importing the table via Hive QL??


Solution

  • As far as I know, You will have to reload the data as you did in Pig. For example, If you already have the table user_data with columns USER_ID:int,MANAGER_ID:int,USER_NAME:int and you are looking for USER_ID:int,MANAGER_ID:int,USER_NAME:int, STATUS:active

    You can re-load the table user_data_status by using something like this

    INSERT OVERWRITE TABLE user_data_status SELECT *, 'active' AS STATUS FROM user_data;
    

    Though there are options to add columns to the existing table, that would only update the metadata in metastore and the values would be defaulted to NULL.

    If I was you, I would rather re-load the complete data rather than looking to update the complete table using UPDATE command after Altering the column structure. Hope this helps !