I would like to do a LATERAL JOIN in Hive. Is there any way to support this? In essence I want to use values in the rows on the LHS as parameters to arbitrary SQL on the RHS.
Here is an example from Postgres: (Forgive my crude example):
create table lhs (
subject_id integer,
date_time BIGINT );
create table events (
subject_id integer,
date_time BIGINT,
event_val integer );
SELECT * from lhs LEFT JOIN LATERAL ( select SUM(event_val) as val_sum, count(event_val) as ecnt from events WHERE date_time < lhs.date_time and subject_id = lhs.subject_id ) rhs1 ON true;
Hive doesn't support LEFT JOIN LATERAL , use below query which is equivalent to your query.I have tested with sample data ,It is producing same result.
select subject_id,date_time,SUM(event_val) as val_sum,COUNT(event_val) as ecnt
from (SELECT a.subject_id as subject_id ,
a.date_time as date_time, b.date_time as bdate , b.event_val as event_val
FROM events b LEFT OUTER JOIN lhs a
ON b.subject_id = a.subject_id) abc
where bdate < date_time group by subject_id,date_time;
Hope that I will help you to formulated things how you can achieve the same in hive.