Search code examples
sqlsql-servergroup-byderived-table

Derive Column By Finding Unique String From Column in a Subset


The idea behind this question is similar to this question with a slight change. Consider data below

Actual Output

+-------------+--------+------+
|     Id      | Weight | Type |
+-------------+--------+------+
| 00011223344 |     35 | A    |
| 00011223344 |     10 | A    |
| 12311223344 |    100 | B    |
| 00034343434 |     25 | A    |
| 00034343434 |     25 | A    |
| 99934343434 |    200 | C    |
| 88855667788 |    100 | D    |
+-------------+--------+------+

Derived Output

+-------------+-------------+--------+---------------+------+
|     Id      |  Actual ID  | Weight | Actual Weight | Type |
+-------------+-------------+--------+---------------+------+
| 00011223344 | 12311223344 |     35 |            35 | A    |
| 00011223344 | 12311223344 |     10 |            10 | A    |
| 12311223344 | 12311223344 |    100 |            55 | B    |
| 00034343434 | 99934343434 |     25 |            25 | A    |
| 00034343434 | 99934343434 |     25 |            25 | A    |
| 99934343434 | 99934343434 |    200 |           150 | C    |
| 88855667788 | 88855667788 |    100 |           100 | D    |
+-------------+-------------+--------+---------------+------+

I need to add another column to table above to add Actual_Type against column type. Above table can be derived by following code also provided here.

select t.*,
       (case when id like '000%' then weight
             else weight - sum(case when id like '000%' then weight else 0 end) over (partition by actual_id)
        end) as actual_weight
from (select t.*,
             max(id) over (partition by stuff(id, 1, 3, '')) as actual_id
      from t
     ) t;

I can't think of how can I apply partition by and apply max on a column Type that is varchar.

Desired Output

+-------------+-------------+--------+---------------+------+-------------+
|     Id      |  Actual ID  | Weight | Actual Weight | Type | Actual Type |   
+-------------+-------------+--------+---------------+------+-------------+
| 00011223344 | 12311223344 |     35 |            35 |  A   | B           |   
| 00011223344 | 12311223344 |     10 |            10 |  A   | B           |   
| 12311223344 | 12311223344 |    100 |            55 |  B   | B           |   
| 00034343434 | 99934343434 |     25 |            25 |  A   | C           |   
| 00034343434 | 99934343434 |     25 |            25 |  A   | C           |   
| 99934343434 | 99934343434 |    200 |           150 |  C   | C           |
| 88855667788 | 88855667788 |    100 |           100 |  D   | D           |
+-------------+-------------+--------+---------------+------+-------------+

Edit: I can not apply min or max over column Type which is nvarchar. I want to partition over column Actual ID and find the value which is unique in column Type i.e. B for 1st 3 rows. For next 3 it is C.


Solution

  • Use FIRST_VALUE() window function:

    SELECT t.Id, t.Weight, t.Type, t.actual_id, t.actual_weight,
           FIRST_VALUE(Type) OVER (PARTITION BY actual_id ORDER BY counter) AS actual_type
    FROM (
      SELECT t.*,
             CASE 
               WHEN id like '000%' THEN weight
               ELSE weight - SUM(CASE WHEN id LIKE '000%' THEN WEIGHT ELSE 0 END) OVER (PARTITION BY actual_id)
             END AS actual_weight,
             COUNT(*) OVER (PARTITION BY actual_id, Type) counter
      FROM (
        SELECT t.*,  MAX(id) OVER (PARTITION BY STUFF(id, 1, 3, '')) AS actual_id
        FROM t
      ) t
    ) t;
    

    See the demo.