I need to extract all rows from a large data set with 30+ columnns based on the min time date, account can repeat across days and execution dates, execution id is distinct and can be thought of as a transaction.
account | execution_id | execution_datetime | execution_date |
---|---|---|---|
1 | 918-46 | 2023-04-03 15:59:09 | 2023-04-03 |
1 | 740-a2 | 2023-04-03 15:56:09 | 2023-04-03 |
1 | 747-9b | 2023-04-03 15:54:09 | 2023-04-03 |
2 | 2bb-14 | 2023-04-03 15:54:09 | 2023-04-03 |
1 | 818-a5 | 2023-04-04 15:47:08 | 2023-04-04 |
Ordinarily something like this would work
select omd.*
from db1 as omd
inner join (
select id, min(execution_datetime)
from db1
group by execution_date,account,id
) as omd1
on omd.id = omd1.id;
This is returning all records vs expected outcome.
account | execution_id | execution_datetime | execution_date |
---|---|---|---|
1 | 747-9b | 2023-04-03 15:54:09 | 2023-04-03 |
2 | 2bb-14 | 2023-04-03 15:54:09 | 2023-04-03 |
1 | 818-a5 | 2023-04-04 15:47:08 | 2023-04-04 |
I'm also can envision there will be a follow-up request to group the transactions across a date range of X so that min transactions across X days will be grouped,and the output might look like this.
account | execution_id | execution_datetime | execution_date |
---|---|---|---|
1 | 747-9b | 2023-04-03 15:54:09 | 2023-04-03 |
2 | 2bb-14 | 2023-04-03 15:54:09 | 2023-04-03 |
Looking for a solution that will hopefully support either scenario, thank you.
This is a job for the window function row_number().
select *
from (
select *,
row_number() over(partition by execution_date, account,id order by
execution_datetime) as row_num
from db1 )
where row_num = 1;
You may want to suppress row_numb from the output and to do this you will need list list all 30 columns that you want.
(sorry if there are any typos)