Search code examples
sqlhivesubqueryself-join

self-joining in hive using different where conditions


I'm working with Hive for the first time and it doesn't support pivot. So I'm trying to either do a series of self-joins or column expressions to get what need. I'm running into errors.

My first table (Employees) looks like this. There's four categorical groups in the role column that I'd like to turn into columns using group bys. One for C-Suite, one for D-Suite, etc. with the alias of the employees as the values.

alias name role oversight functional_area
jbob Jo Bob C-Suite Finance Orders
jndo Jon Doe D-Suite Finance Orders
jndo Jon Doe D-Suite E-Commerce Orders
tweb Tim Webb SME Analytics Fraud
kpow Kat Power C-Suite Controls Architecture
lguan Lin Guan D-Suite Sales Marketing
phar Pat Harms Manager HR HR

This is what I'm working with:

SELECT
    U.alias
    U.name
    U.role
    U.oversight
    U.functional_are
FROM employee U
LEFT OUTER JOIN 
   (SELECT 
       c.alias, c.oversight
   FROM employee c
   WHERE role = 'C-Suite' AND oversight <> 'N/A'
   GROUP BY oversight, alias)
ON U.alias = c.alias 

And I'm getting error messages on the join when I try and just self-join c-suite. I also get an error when I try and put c.alias as c_suite in the select statement and not do a subquery. In R this is done with gather and spread, but I'm getting confused with Hive syntax.


Solution

  • For anyone else coming across this question, I also have a piece of proprietary software in addition to Hive that I'm using to make extracts and it cherrypicks SQL syntax. Some functions work, some don't. CASE worked for me. Not the cleanest query, but it works like pivot. I'm able to make joins after from statement.

    SELECT 
    CASE WHEN U.role = 'C-Suite' THEN U.alias 
    ELSE 'NULL' END AS "C_Alias",
    CASE WHEN U.role = 'D-Suite' THEN U.alias
    ELSE 'NULL' END AS "D_Alias",
    CASE WHEN U.role = 'SME' THEN U.alias
    ELSE 'NULL' END AS "SME_Alias",
    CASE WHEN U.role = 'Manager' THEN U.alias
    ELSE 'NULL' END AS "Manager_Alias",
    U.role
    ,U.name
    ,U.oversight
    ,U.functional_area
    FROM employees U