I have a dataframe like the following:
Country    A    B                             C
HONDURAS   PL   Partido Liberal               12.00000
HONDURAS   PN   Partido Nacional de Honduras  17.33000
NICARAGUA  PN   Partido Nacional              12.00000
CHILE      PN   Partido Nacional              17.33000
HONDURAS   PNH  Partido Nacional              17.33000
HONDURAS   PNH  Partido Nacional de Honduras  17.33000
I am performing a merge with another dataframe, and to do so I need only unique parties. From this section of the dataframe, I therefore need to keep only one of the "HONDURAS Partido Nacional" duplicates as the unique observation (it doesn't matter which one; the first occurrence would suffice). The problem is that some parties share the same abbreviation ("A") or the same "C" value, so I need unique observations of equivalent parties within each country. The desired output would therefore be the following:
Country    A    B                             C
HONDURAS   PL   Partido Liberal               12.00000
HONDURAS   PN   Partido Nacional de Honduras  17.33000
NICARAGUA  PN   Partido Nacional              12.00000
CHILE      PN   Partido Nacional              17.33000
This way, the Nicaragua and Honduras "Partido Nacional" rows are both retained even though they have the same abbreviation, because they belong to different countries. Likewise, the Chile and Honduras "Partido Nacional" rows are both retained even though they have the same "C" value, because the country is different. Essentially, I need unique observations where Country-A-C match (but there are multiple values of B for that set), or where Country-B-C match (but there are multiple values of A for that set).
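Try distinct() from dplyr, using Country and C as the key. With .keep_all = TRUE, the remaining columns are retained and the first row of each Country-C combination is kept:
library(dplyr)
df1 %>%
  distinct(Country, C, .keep_all = TRUE)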
Output:
    Country   A                            B     C
1  HONDURAS  PL              Partido Liberal 12.00
2  HONDURAS  PN Partido Nacional de Honduras 17.33
3 NICARAGUA  PN             Partido Nacional 12.00
4     CHILE  PN             Partido Nacional 17.33
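Note that keying on Country and C alone works for this sample, but it would also collapse two genuinely different parties in the same country that happen to share a C value. If that could occur in the full data, the following base R sketch may be closer to the rule as stated in the question: rows are linked when they share Country-A-C or Country-B-C, links are followed transitively, and the first row of each linked group is kept. The data frame is rebuilt from the sample (column types are assumed), and the names key_ac, key_bc, grp, by_country_c and by_linked_party are illustrative only.

```r
# Sample data from the question (column types are an assumption).
df1 <- data.frame(
  Country = c("HONDURAS", "HONDURAS", "NICARAGUA", "CHILE", "HONDURAS", "HONDURAS"),
  A = c("PL", "PN", "PN", "PN", "PNH", "PNH"),
  B = c("Partido Liberal", "Partido Nacional de Honduras", "Partido Nacional",
        "Partido Nacional", "Partido Nacional", "Partido Nacional de Honduras"),
  C = c(12, 17.33, 12, 17.33, 17.33, 17.33),
  stringsAsFactors = FALSE
)

# Base R analogue of distinct(Country, C, .keep_all = TRUE):
# keep the first row for each Country-C pair.
by_country_c <- df1[!duplicated(df1[c("Country", "C")]), ]

# Stricter rule: link rows that share Country-A-C or Country-B-C,
# follow chains of links, then keep the first row of each linked group.
# Assumes "|" does not occur in the key columns.
key_ac <- paste(df1$Country, df1$A, df1$C, sep = "|")
key_bc <- paste(df1$Country, df1$B, df1$C, sep = "|")
grp <- seq_len(nrow(df1))              # every row starts in its own group
repeat {
  old <- grp
  grp <- ave(grp, key_ac, FUN = min)   # spread the smallest id within each Country-A-C key
  grp <- ave(grp, key_bc, FUN = min)   # and within each Country-B-C key
  if (all(grp == old)) break           # stop once no id changes
}
by_linked_party <- df1[!duplicated(grp), ]

print(by_linked_party)
```

On the sample data both approaches keep the same four rows; they would differ only where one country has unrelated parties with equal C values.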