Short:
This is the SQL I try to achieve in django:
with start_dates as
(
select m2.person_id, min(m2.date_current) as start_date
from my_table m2
where [...]
group by person_id
)
select *
from my_table m
inner join start_dates sd
on m.person_id = sd.person_id
and m.date_current >= sd.start_date
;
Long:
There is a table which I need to access twice. The table constists of activities of people. Each people has multiple activities so multiple rows in this table.
1)
First access is to find for a set of given filters (like the name of the acitivity and the age of the person) which persons activities fit the filters and per person what is its oldest acitivity fitting the filters. I call that oldest-date "start_date".
On SQL this could look like:
select m2.person_id, min(m2.date_current) as start_date
from my_table m2
where [...]
group by person_id
With this query I found a sub-set of people with their oldest filtered-activity (so there may be also older activities which do not fit the filters above) Sample
A sample would be to find all people ever did taking drugs plus the date when it happened first.
2)
Second access is to get for every person found all its acitivies not older the oldest-date found. Again this still does not contain older activities than the start-data. But this now may contain newer activities which may not fit the filters previously.
SQL
with start_dates as
(
select m2.person_id, min(m2.date_current) as start_date
from my_table m2
where [...]
group by person_id
)
select *
from my_table m
inner join start_dates sd
on m.person_id = sd.person_id
and m.date_current >= sd.start_date
;
Sample
With the second access I know can find out what activities alle these people did after they first took drugs. How they did going.
What I tried:
Using django I am failing and failing to achieve this. There is no primary/foreign key defined and we are not going to do this.
First query is easy on django and works:
queryset1 = MyTable.objects
def get(self, request, pk=None):
[...]
self.queryset1 = self.queryset1.filter(**(prepped_params["filters"]))
self.queryset1 = self.queryset1.values('person_id').annotate(start_date=Min('date_current'))
Now I struggle how to join the people and their first-dates with the same table again.
something like that does not work:
queryset2 = MyTable.object
[...]
self.queryset2.filter(person_id__in=self.queryset1.person_id && date_current__in=self.queryset1.start_date)
My only other approach is to iterate through query2 and check if it exists in query1 using python code. I think this may be very inefficient.
Thanks for help.
You can try like this with Subquery:
from django.db.models import OuterRef, Subquery
subquery = MyTable.objects.filter(person_id=OuterRef('person_id')).order_by('date_current')
queryset = MyTable.objects.annotate(start_date=Subquery(subquery.values('date_current')[:1]))
print(queryset.values('person_id', 'date_current', 'start_date'))