r dataframe for-loop iterator

Looping through 2 columns and all rows in R and forming 2 new columns based on their values


I've got an R data frame df with two columns, Country1 and Country2, and a certain number of rows. I also have two vectors, Continent_Europe and Continent_Africa. Every element of Country1 and Country2 is also an element of one, and only one, of the two vectors Continent_Europe and Continent_Africa.

I'd like to create two new columns in df, called Country1_Continent and Country2_Continent, that specify whether the corresponding elements of Country1 and Country2 are in Europe or Africa.

In other words, based on whether each element of my two dataframe columns belongs to Continent_Europe or Continent_Africa, I'd like to assign them values Europe or Africa respectively, and form two new columns containing this information.

Here is a working example of my df:

  Country1 Country2
1   France  Austria
2    Spain  Nigeria
3  Nigeria    Italy
4  Austria  Nigeria
5 Cameroon   France
6   France    Spain

My two vectors are:

Continent_Europe = c("Spain",  "Austria", "France", "Italy", "Germany", "Denmark")

Continent_Africa = c("Nigeria", "Cameroon", "Botswana", "Angola")

Desired output:

  Country1 Country2 Country1_Continent Country2_Continent
1   France  Austria             Europe             Europe
2    Spain  Nigeria             Europe             Africa
3  Nigeria    Italy             Africa             Europe
4  Austria  Nigeria             Europe             Africa
5 Cameroon   France             Africa             Europe
6   France    Spain             Europe             Europe

How can I implement this?


Solution

  • Here is an option if you want to use a loop. It assumes the two country vectors are stored in a named list, continent (see Data below):

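    # For each country column, and for each continent in the named list ...
    for (col in c("Country1", "Country2")) {
      for (cont in names(continent)) {
        # ... write the continent name into <col>_Continent for the matching rows
        df[df[[col]] %in% continent[[cont]], paste0(col, "_Continent")] <- cont
      }
    }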
    #   Country1 Country2 Country1_Continent Country2_Continent
    # 1   France  Austria             Europe             Europe
    # 2    Spain  Nigeria             Europe             Africa
    # 3  Nigeria    Italy             Africa             Europe
    # 4  Austria  Nigeria             Europe             Africa
    # 5 Cameroon   France             Africa             Europe
    # 6   France    Spain             Europe             Europe
    

    Data

    df <- data.frame(
      Country1 = c("France", "Spain", "Nigeria", "Austria", "Cameroon", "France"),
      Country2 = c("Austria", "Nigeria", "Italy", "Nigeria", "France", "Spain")
    )
    continent <- list(
      Europe = c("Spain",  "Austria", "France", "Italy", "Germany", "Denmark"),
      Africa = c("Nigeria", "Cameroon", "Botswana", "Angola")  
    )
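
As a possible alternative to the nested loop, the same result can probably be reached with a single lookup per column. The sketch below reuses the df and continent objects from the Data section. The name lookup is a hypothetical helper introduced here for illustration and is not part of the original answer.

```r
# Sample data, copied from the Data section above.
df <- data.frame(
  Country1 = c("France", "Spain", "Nigeria", "Austria", "Cameroon", "France"),
  Country2 = c("Austria", "Nigeria", "Italy", "Nigeria", "France", "Spain")
)
continent <- list(
  Europe = c("Spain", "Austria", "France", "Italy", "Germany", "Denmark"),
  Africa = c("Nigeria", "Cameroon", "Botswana", "Angola")
)

# Flatten the list into a named character vector:
# names are countries, values are continent names.
# `lookup` is a hypothetical name chosen for this sketch.
lookup <- setNames(
  rep(names(continent), lengths(continent)),
  unlist(continent, use.names = FALSE)
)

# Index the lookup by country name, one column at a time.
# as.character() guards against factor columns (the default before R 4.0).
for (col in c("Country1", "Country2")) {
  df[[paste0(col, "_Continent")]] <- unname(lookup[as.character(df[[col]])])
}

print(df)
```

With this approach, a country that appears in neither vector ends up as NA in the new column, which can make unmatched values easy to spot.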