Search code examples
sqlsql-serverwhere-clause

Deleting observations based on certain conditions in SQL


I currently have the dataset below:

Group Start End
A 2021-01-01 2021-04-05
A 2021-01-01 2021-06-05
A 2021-03-01 2021-06-05
B 2021-06-13 2021-08-05
B 2021-06-13 2021-09-05
B 2021-07-01 2021-09-05
C 2021-10-07 2021-10-17
C 2021-10-07 2021-11-15
C 2021-11-12 2021-11-15

I want like the following final dataset: Essentially, I would like to remove all observations that don't equal the minimum start value and I want to do this by group.

Group Start End
A 2021-01-01 2021-04-05
A 2021-01-01 2021-06-05
B 2021-06-13 2021-08-05
B 2021-06-13 2021-09-05
C 2021-10-07 2021-10-17
C 2021-10-07 2021-11-15

I tried the following code but I cannot do a min statement in a where clause. Any help would be appreciated.

Delete from #df1
where start != min(start)

Solution

  • If you want to remove all rows, that have not the same [start] you can join a subquery which find the earliest day, you can add additional ON clauses if you need to find other rows as well

    DELETE
     o1
     FROM observations o1 
     INNER JOIN(SELECT MIN([Start]) minstart , [Group] FROM observations GROUP BY [Group] ) o2
     ON o1.[Group] = o2.[Group] AND o1.[Start] <> o2.minstart
    
    SELECT *
     FROM observations
    
    Group | Start      | End       
    :---- | :--------- | :---------
    A     | 2021-01-01 | 2021-04-05
    A     | 2021-01-01 | 2021-06-05
    B     | 2021-06-13 | 2021-08-05
    B     | 2021-06-13 | 2021-09-05
    C     | 2021-10-07 | 2021-10-17
    C     | 2021-10-07 | 2021-11-15
    

    db<>fiddle here