Search code examples
hadoopjoinhivelateral

Can I do a LEFT JOIN LATERAL in Hive?


I would like to do a LATERAL JOIN in Hive. Is there any way to support this? In essence I want to use values in the rows on the LHS as parameters to arbitrary SQL on the RHS.

Here is an example from Postgres: (Forgive my crude example):

create table lhs (
    subject_id integer,
    date_time  BIGINT );

create table events (
    subject_id  integer,
    date_time   BIGINT,
    event_val   integer );

SELECT * from lhs LEFT JOIN LATERAL ( select SUM(event_val) as val_sum, count(event_val) as ecnt from events WHERE date_time < lhs.date_time and subject_id = lhs.subject_id ) rhs1 ON true;

Solution

  • Hive doesn't support LEFT JOIN LATERAL , use below query which is equivalent to your query.I have tested with sample data ,It is producing same result.

    select subject_id,date_time,SUM(event_val) as val_sum,COUNT(event_val) as ecnt 
    from (SELECT a.subject_id as subject_id ,
          a.date_time as date_time, b.date_time as bdate , b.event_val as event_val
          FROM events b LEFT OUTER JOIN lhs a 
          ON b.subject_id = a.subject_id) abc 
    where bdate < date_time group by subject_id,date_time;
    

    Hope that I will help you to formulated things how you can achieve the same in hive.