i have created bucketed table called emp_bucket into 4 buckets clustered on salary column. The structure of the table is as below:
hive> describe Consultant_Table_Bucket;
OK
id int
age int
gender string
role string
salary double
Time taken: 0.069 seconds, Fetched: 5 row(s)
I also have a staging table from where i can insert data into the above bucketed table. Below is the sample data in the staging table:
id age Gender role salary
-----------------------------------------------------
938 38 F consultant 55038.0
939 26 F student 33319.0
941 20 M student 97229.0
942 48 F consultant 78209.0
943 22 M consultant 77841.0
My requirement is to load data into the bucketed table for those employees whose salary is greater than 10,000 and while loading i have to convert "consultant" role to BigData consultant role.
I know how to insert data into my bucketed table using the select command, but need some guidance how can the consultant value in the role column above can be changed to BigData consultant while inserting.
Any help appreciated
Based on your insert
, you just need to work on the role
part of your select
:
INSERT into TABLE bucketed_user PARTITION (salary)
select
id
, age
, gender
, if(role='consultant', 'BigData consultant', role) as role
, salary
FROM
stage_table
where
salary > 10000
;