sql database optimization db2 query-optimization

DB2: Select rows where the table1 date is older than date in table2

I am trying to optimise my SQL query for use in a big DB2 table with 10,000s of records. I need to extract rows from Table 1, where the date is older than the rows in Table 2. Table 2 can have many rows that are more recent but I only care about the Table 1 data. I have a query but it's really slow and I am looking to optimise it further. I have gone for this approach:

SELECT * FROM TABLE1 t1
WHERE t1.START_DATE < (
    SELECT MAX(DATE(t2.TIMESTAMP_DATE))
    FROM TABLE2 t2
    WHERE t1.NAME = t2.NAME
);

Here is an example dataset which runs fine with low volumes but my big dataset is running too slow...

Table 1:

NAME	START_DATE
JOHN	2018-01-01
JIM	2019-01-01
ALEX	2023-01-01

Table 2:

NAME	TIMESTAMP_DATE
JOHN	2022-02-22-09.20.23.891000
JOHN	2022-03-22-09.20.23.891000
JOHN	2022-04-22-09.20.23.891000
JIM	2022-01-22-09.20.23.891000
JIM	2022-02-22-09.20.23.891000
ALEX	2022-05-01-09.20.23.891000

So from my query, I want to extract the follow dataset:

NAME	START_DATE
JOHN	2018-01-01
JIM	2019-01-01

Could I please get any tips to optimise for a massive dataset in both tables?

Solution

Building a list of names and max dates from T2 and joining is more efficient, try this :

select
  t1.* 
from table1 t1
inner join (
  select 
    name,
    date(max(timestamp_date)) maxdate
  from table2 t2
  group by name
) b on b.name = t1.name
where t1.start_date < b.maxdate