Search code examples
sqloracle-databaseaggregate-functions

Using Min and Max to remove duplicate rows, and how to handle null values


I am creating an SQL query in Oracle and I am trying to remove duplicate rows with a min and max aggregate function in a case statement. Here is the code at its current state:

select 
   Student_number,
       case 
       when min(sr.racecd) = max(sr.racecd) then min(sr.racecd) else 'Two or more races' 
   end as races               

this is what the output looks like

Student Number   Race
4322             two or more races
4324             White

When I run the code it combines multiple rows into one and changes the name to 'two or more races'. But, the problem I am having is when it runs into a null value it changes it to 'Two or more races', too. How can I keep the Nulls as is, or change them to unknown? Also, when I add other columns in to the query the aggregate function does not work the same as when I am querying only studentnumber and racecd, why is that?


Solution

  • Because NULL does not equal anything, so failing the equality test it goes to the THEN clause. There are a number of solutions. One is to use COUNT instead which ignores NULLs, like this:

    select 
       Student_number,
           CASE WHEN (COUNT(DISTINCT sr.raced) > 1) THEN 'Two or more races'
                ELSE MAX(sr.racecd)
           END
       end as races   
    

    As far as the issue with adding columns, when you add columns that are not aggregated you will be forced to include them in your GROUP BY. That changes the granularity of your query and the rows included in each group, so it will impact the result. If you want more information about student besides student_number, you'll want to aggregate other columns (e.g. with MAX() ).