Search code examples
sqlsql-servert-sqlssissql-server-data-tools

How can I create header records by taking values from one of several line items?


I have a set of sorted line items. They are sorted first by ID then by Date:

| ID  | DESCRIPTION |   Date   |
| --- | ----------- |----------|
| 100 | Red         |2019-01-01|
| 101 | White       |2019-01-01|
| 101 | White_v2    |2019-02-01|
| 102 | Red_Trim    |2019-01-15|
| 102 | White       |2019-01-16|
| 102 | Blue        |2019-01-20|
| 103 | Red_v3      |2019-01-14|
| 103 | Red_v3      |2019-03-14|

I need to insert rows in a SQL Server table, which represents a project header, so that the first row for each ID provides the Description and Date in the destination table. There should only be one row in the destination table for each ID.

For example, the source table above would result in this at the destination:

| ID  | DESCRIPTION |   Date   |
| --- | ----------- |----------|
| 100 | Red         |2019-01-01|
| 101 | White       |2019-01-01|
| 102 | Red_Trim    |2019-01-15|
| 103 | Red_v3      |2019-01-14|

How do I collapse the source so that I take only the first row for each ID from source?

I prefer to do this with a transformation in SSIS but can use SQL if necessary. Actually, solutions for both methods would be most helpful.


This question is distinct from Trouble using ROW_NUMBER() OVER (PARTITION BY …) in that this seeks to identify an approach. The asker of that question has adopted one approach, of more than one available as identified by answers here. That question is about how to make that particular approach work.


Solution

  • use first_value window function

    select * from (select *,
    first_value(DESCRIPTION) over(partition by id order by Date) as des,
    row_number() over(partition by id order by Date) rn
    from table
    ) a where a.rn =1