I have the following table
ID Date Activity
1 10-20-2019 X
1 10-20-2019 Y
1 10-23-2019 X
1 10-23-2019 Y
1 10-25-2019 Z
2 10-10-2019 X
2 10-15-2019 X
2 10-16-2019 X
2 10-20-2020 X
3 10-12-2020 X
3 10-13-2020 X
3 10-14-2020 Y
3 10-15-2020 Y
4 10-10-2020. Y
4 10-12-2020 Y
4 10-15-2020 Z
4 10-15-2020 Z
Each ID represents a unique person who performs an activity on a certain date. A person may perform multiple activities on the same date (ID 1 performs activity X and Y on the same day multiple times) and this is recorded as two separate entries in the table.
I need an SQL query that returns all the IDs of people who have ever performed a different activity from the task(s) they performed on their first day. The new activity cannot be a subset of the first activity (if ID 1 performs X and Y on one day and then just X on another, this does not count as a different activity).
I used min(date) to find the activity on their first date, but I am having trouble dealing with the case of performing multiple activities on the same day. Also my query takes about 1 min to run which is surprising given I have only 2000 records.
The query run on the table above should ideally return the following:
ID
1
3
4
Thank you in advance for the help.
If I follow you correctly, you can use two levels of aggregation:
select id
from (
select id, activity, min(date) min_date
from mytable t
group by id, activity
) t
group by id
having min(min_date) <> max(min_date)
The idea is to compute the minimum date per id
*and activity
. If a given id
has more than one distinct minimum date, then we know an new activity appeared later on.