I want to assign a value based on the condition in CASE WHEN
statement for other rows associated with a user_id
.
In step3
, the journey differs. So when step = 'step3_eu'
then I want to create a column called type
and assign 'Category 2'
for the rows associated with that user_id
, else 'Category 1'
.
Here is the sample data:
+---------+----------+---------------------+
| user_id | step | created_at |
+---------+----------+---------------------+
| 7bc6de | step1 | 2021-01-13 08:47:39 |
| 7bc6de | step2 | 2021-01-14 06:12:30 |
| 7bc6de | step3 | 2021-01-18 07:29:16 |
| 7bc6de | step4 | 2021-01-18 11:10:08 |
| 7bc6de | step5 | 2021-01-19 08:31:30 |
| 0ee6df | step1 | 2021-02-08 23:31:55 |
| 0ee6df | step2 | 2021-02-12 19:50:53 |
| 0ee6df | step3_eu | 2021-02-13 00:04:34 |
| 0ee6df | step4 | 2021-02-18 16:42:41 |
| 0ee6df | step5 | 2021-02-22 08:52:30 |
+---------+----------+---------------------+
Here is the desired output:
+---------+----------+---------------------+------------+
| user_id | step | created_at | type |
+---------+----------+---------------------+------------+
| 7bc6de | step1 | 2021-01-13 08:47:39 | Category 1 |
| 7bc6de | step2 | 2021-01-14 06:12:30 | Category 1 |
| 7bc6de | step3 | 2021-01-18 07:29:16 | Category 1 |
| 7bc6de | step4 | 2021-01-18 11:10:08 | Category 1 |
| 7bc6de | step5 | 2021-01-19 08:31:30 | Category 1 |
| 0ee6df | step1 | 2021-02-08 23:31:55 | Category 2 |
| 0ee6df | step2 | 2021-02-12 19:50:53 | Category 2 |
| 0ee6df | step3_eu | 2021-02-13 00:04:34 | Category 2 |
| 0ee6df | step4 | 2021-02-18 16:42:41 | Category 2 |
| 0ee6df | step5 | 2021-02-22 08:52:30 | Category 2 |
+---------+----------+---------------------+------------+
SELECT
*,
CASE
WHEN bool_or(step = 'step3_eu') OVER (PARTITION BY user_id) THEN 'Category 2'
ELSE 'Category 1'
END as type
FROM mytable
You can use bool_or()
window function. This checks if there is any record with value step3_eu
within a group (= partition of user_id
).