I have 2 tables "items" and "customers".
In both tables, "customer_id" is present and a single customer can have more than 1 item. In the "items" table there is also a timestamp field called "date_created" when an item was purchased.
I want to construct a query that can return each customer_id and item_id associated with the first item each customer bought in a specific year, let's say 2020.
My approach was
SELECT customer_id, items
INNER JOIN items ON items.customer_id=customers.customer_id
and then try to use the EXTRACT function to take care of the first item each customer bought in 2020 but I can't seem to extract the first item only for the specific year. I would really appreciate some help. I am using PostgreSQL. Thank you.
Just use distinct on
:
select distinct on (customer_id) i.*
from items i
where date_created >= date '2020-01-01' and
date_created < date '2021-01-01'
order by customer_id, date_created;
First, note the use of direct date comparisons. This makes it easier for the optimizer to choose the best execution plan.
distinct on
is a handy Postgres extension that returns the first row encountered for the keys in parentheses, which must be the first keys in the order by
. "First" is based on the subsequent order by
keys.