I need to run the following join without using a correlated subquery, as I am restricted to either using Hive or Presto, both of which fail due to my using a correlated subquery.
I have worked this down to a MWE. I have a table of each user and their 18th birthdays. I have another table of each time each user visited a movie theatre. I want to merge in only the last time a user visited my movie cinema. The code that would work on native SQL is below.
What is the most efficient workaround that does not require me to join every instance of the user visiting the movie theatre (it is far too large).
SELECT
people.*,
tickets.uid
tickets.date
FROM all_customers as people
JOIN tkting as tickets
on people.uid = tickets.uid
and tickets.date = (select
lastvisit.date
from tickets as lastvisit
where
lastvisit.uid = people.uid
and lastvisit.date < people.birthday_18
order by lastvisit.date asc
limit 1)
Instead of this inner query:
SELECT lastvisit.date
...
ORDER BY lastvisit.date ASC
LIMIT 1
you can try with:
SELECT min(lastvisit.date)
...