In a dataframe like below:
id date product
1 2010-02-01 c
1 2010-02-02 v
1 2010-02-03 d
1 2010-02-04 g
2 2010-02-03 h
2 2010-02-04 w
2 2010-02-05 t
2 2010-02-06 d
3 2010-02-04 x
3 2010-02-05 f
3 2010-02-06 x
I want to add another column giving the product that the user used in the minimum date. So it needs to be like below:
id date product early_product
1 2010-02-01 c c
1 2010-02-02 v c
1 2010-02-03 d c
1 2010-02-04 g c
2 2010-02-03 h h
2 2010-02-04 w h
2 2010-02-05 t h
2 2010-02-06 d h
3 2010-02-04 x x
3 2010-02-05 f x
3 2010-02-06 x x
I know I need to use the window function and started with the one below. But that will provide me the earliest date per user. I need to find the product used in the earliest day for user:
min(date) over (partition by id) as earliest_date
Use FIRST_VALUE as Window function
CREATE TABLE table1 ( `id` INTEGER, `date` Date, `product` VARCHAR(1) ); INSERT INTO table1 (`id`, `date`, `product`) VALUES ('1', '2010-02-01', 'c'), ('1', '2010-02-02', 'v'), ('1', '2010-02-03', 'd'), ('1', '2010-02-04', 'g'), ('2', '2010-02-03', 'h'), ('2', '2010-02-04', 'w'), ('2', '2010-02-05', 't'), ('2', '2010-02-06', 'd'), ('3', '2010-02-04', 'x'), ('3', '2010-02-05', 'f'), ('3', '2010-02-06', 'x');
SELECT `id`, `date`, `product` , FIRST_VALUE(`product`) OVER(PARTITION BY `id` ORDER BY `date` ROWS UNBOUNDED PRECEDING) minproduct FROM table1
id | date | product | minproduct -: | :--------- | :------ | :--------- 1 | 2010-02-01 | c | c 1 | 2010-02-02 | v | c 1 | 2010-02-03 | d | c 1 | 2010-02-04 | g | c 2 | 2010-02-03 | h | h 2 | 2010-02-04 | w | h 2 | 2010-02-05 | t | h 2 | 2010-02-06 | d | h 3 | 2010-02-04 | x | x 3 | 2010-02-05 | f | x 3 | 2010-02-06 | x | x
db<>fiddle here