Tags: hadoop, hive, hadoop-partitioning

Convert a value while inserting into a Hive table


I have created a bucketed table called emp_bucket with 4 buckets, clustered on the salary column. The structure of the table is as below:

hive> describe Consultant_Table_Bucket;
OK
id                      int                                         
age                     int                                         
gender                  string                                      
role                    string                                      
salary                  double                                      
Time taken: 0.069 seconds, Fetched: 5 row(s)
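
For reference, the DDL for such a bucketed table might look like the sketch below. The column names and types come from the DESCRIBE output above; the table name matches it, but the ROW FORMAT clause is an assumption:

```sql
-- Hypothetical DDL sketch for the bucketed table described above;
-- the ROW FORMAT clause is an assumption about the storage layout.
CREATE TABLE Consultant_Table_Bucket (
    id      INT,
    age     INT,
    gender  STRING,
    role    STRING,
    salary  DOUBLE
)
CLUSTERED BY (salary) INTO 4 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
```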

I also have a staging table from which I can insert data into the above bucketed table. Below is sample data from the staging table:

id      age     gender  role         salary
-------------------------------------------
938     38      F       consultant   55038.0
939     26      F       student      33319.0
941     20      M       student      97229.0
942     48      F       consultant   78209.0
943     22      M       consultant   77841.0

My requirement is to load into the bucketed table only those employees whose salary is greater than 10,000, and while loading I have to convert the "consultant" role to "BigData consultant".

I know how to insert data into my bucketed table using a SELECT statement, but I need some guidance on how the "consultant" value in the role column can be changed to "BigData consultant" during the insert.

Any help is appreciated.


Solution

  • Based on your insert, you just need to work on the role column in your SELECT:

    -- if() rewrites the role inline; every other column passes through unchanged
    INSERT INTO TABLE bucketed_user PARTITION (salary)
    SELECT
        id
      , age
      , gender
      , if(role = 'consultant', 'BigData consultant', role) AS role
      , salary
    FROM
      stage_table
    WHERE
      salary > 10000
    ;
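
If you prefer ANSI-style syntax, the same rewrite can be expressed with a CASE expression, which Hive also supports and which behaves identically to the if() call above:

```sql
-- CASE-based variant of the same insert; table names follow the
-- original query and are assumptions carried over from it.
INSERT INTO TABLE bucketed_user PARTITION (salary)
SELECT
    id,
    age,
    gender,
    CASE WHEN role = 'consultant' THEN 'BigData consultant'
         ELSE role
    END AS role,
    salary
FROM stage_table
WHERE salary > 10000;
```

CASE scales better if you later need to remap more than one role value in the same column.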