R Extract unique rows conditional on other columns' differences


I have a dataframe like the following:

Country    A    B                             C
HONDURAS   PL   Partido Liberal               12.00000
HONDURAS   PN   Partido Nacional de Honduras  17.33000
NICARAGUA  PN   Partido Nacional              12.00000
CHILE      PN   Partido Nacional              17.33000
HONDURAS   PNH  Partido Nacional              17.33000
HONDURAS   PNH  Partido Nacional de Honduras  17.33000

I am merging this with another dataframe, and to do so I need to keep only unique parties. From this section of the dataframe, I therefore need just one of the "HONDURAS Partido Nacional" duplicates as the unique observation (it doesn't matter which one; the first occurrence would suffice). The problem is that some parties in different countries share the same abbreviation or "C" value, so I need unique observations of equivalent parties within each country. The desired output would therefore be the following:

Country    A    B                             C
HONDURAS   PL   Partido Liberal               12.00000
HONDURAS   PN   Partido Nacional de Honduras  17.33000
NICARAGUA  PN   Partido Nacional              12.00000
CHILE      PN   Partido Nacional              17.33000

This way, even though the Nicaragua and Honduras "Partido Nacional" entries have the same abbreviation, both are retained because the countries differ. Likewise, the Chile and Honduras "Partido Nacional" entries have the same "C" value, but both are retained because the countries differ. Essentially, I need a single observation whenever Country, A, and C match (but there are several values of B for that set), or whenever Country, B, and C match (but there are several values of A for that set).


Solution

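  • Try with distinct from dplyr, deduplicating on Country and C. With .keep_all = TRUE, the first row of each Country/C combination is kept together with the remaining columns. This assumes that C identifies a party within a country, as it does in the sample data.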

    library(dplyr)
    df1 %>% 
      distinct(Country, C, .keep_all = TRUE)
    

    -output

       Country  A                            B     C
    1  HONDURAS PL              Partido Liberal 12.00
    2  HONDURAS PN Partido Nacional de Honduras 17.33
    3 NICARAGUA PN             Partido Nacional 12.00
    4     CHILE PN             Partido Nacional 17.33
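
For anyone who wants to reproduce the result without dplyr, here is a minimal base R sketch of the same idea. It assumes the data frame is called df1, as in the answer; the values are re-typed from the question, and the names keep and result are only illustrative. Like the distinct() call, it treats rows with the same Country and C as the same party, which holds for the sample data but may not hold for every dataset.

```r
# Example data re-typed from the question; the name df1 follows the answer.
df1 <- data.frame(
  Country = c("HONDURAS", "HONDURAS", "NICARAGUA", "CHILE", "HONDURAS", "HONDURAS"),
  A = c("PL", "PN", "PN", "PN", "PNH", "PNH"),
  B = c("Partido Liberal", "Partido Nacional de Honduras", "Partido Nacional",
        "Partido Nacional", "Partido Nacional", "Partido Nacional de Honduras"),
  C = c(12, 17.33, 12, 17.33, 17.33, 17.33),
  stringsAsFactors = FALSE
)

# duplicated() flags every row whose (Country, C) pair has already appeared,
# so negating it keeps the first occurrence of each pair.
keep <- !duplicated(df1[, c("Country", "C")])
result <- df1[keep, ]
rownames(result) <- NULL  # renumber the remaining rows from 1

print(result)
```

If two genuinely different parties in one country could share a C value, adding A or B to the columns passed to duplicated() (or to distinct()) would keep them apart, at the cost of no longer collapsing spelling variants in that column.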