Search code examples
sqlpostgresqlnullwindow-functionscoalesce

Replace NULL values per partition


I want to fill NULL values in device column for each session_id with an associated non-NULL value. How can I achieve that?

Here is the sample data:

+------------+-------+---------+
| session_id | step  | device  |
+------------+-------+---------+
| 351acc     | step1 |         |
| 351acc     | step2 |         |
| 351acc     | step3 | mobile  |
| 351acc     | step4 | mobile  |
| 350bca     | step1 | desktop |
| 350bca     | step2 |         |
| 350bca     | step3 |         |
| 350bca     | step4 | desktop |
+------------+-------+---------+

Desired output:

+------------+-------+---------+
| session_id | step  | device  |
+------------+-------+---------+
| 351acc     | step1 | mobile  |
| 351acc     | step2 | mobile  |
| 351acc     | step3 | mobile  |
| 351acc     | step4 | mobile  |
| 350bca     | step1 | desktop |
| 350bca     | step2 | desktop |
| 350bca     | step3 | desktop |
| 350bca     | step4 | desktop |
+------------+-------+---------+

Solution

  • The window function first_value() with the right ordering is probably cheapest:

    SELECT session_id, step
         , COALESCE(device
                  , first_value(device) OVER (PARTITION BY session_id ORDER BY device IS NULL, step)
                   ) AS device
    FROM   tbl
    ORDER  BY session_id DESC, step;
    

    db<>fiddle here

    ORDER BY device IS NULL, step sorts NULL values last, so the earliest step with a notnull value is picked. See:

    If notnull devices per session_id are always the same, you can simplify to just ORDER BY device IS NULL. And you don't need COALESCE.