r string list variables vector

Create a group variable from the names of a list of vectors in R


Sorry if this has already been answered somewhere; I couldn't find it.

I have:

vector <- c("WHEA","RICE","MAIZ","BARL","PMIL","SMIL","SORG","OCER","POTA","SWPO","YAMS","CASS","ORTS","BEAN",
              "CHIC","COWP","PIGE","LENT","OPUL","SOYB","GROU","CNUT","BANA","PLNT","TROF","TEMF","VEGE","OILP",
              "SUNF","RAPE","SESA","OOIL","SUGC","SUGB","COTT","OFIB","ACOF","RCOF","COCO","TEAS","TOBA","REST")

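# Use `=` inside list() so the elements are named; `<-` would instead create
# global variables and leave the list unnamed
vector_list <- list(Cereal = vector[1:8], Roots = vector[9:13], Pulses = vector[14:19], oilcr = vector[20:27], millet = vector[5:6], coffee = vector[32:33], fruit = vector[39:40], banpl = vector[37:38])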

names(vector_list) <- c("Cereal", "Roots", "Pulses", "oilcr", "millet", "coffee", "fruit", "banpl")
Test <-  data.frame(A=vector, B = 1:42)

I want to create a new column, Test$group. For each value of Test$A, it should check whether that value appears in one of the vectors in vector_list and, if so, return the name of the corresponding list element.

I tried the following approach, for example, but it failed:

Test$group <- sapply(groups,  function (x){
  if (x %in% Test$A)  return(names(x))})

Solution

  • You can try something like the following:

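    # Flatten the list; the element names become "Cereal1", "Cereal2", ...
    tmp <- unlist(vector_list)
    # Swap names and values so that each crop code maps to its group label
    rename_vec <- names(tmp)
    names(rename_vec) <- tmp
    
    # as.character() guards against A being a factor (the default before R 4.0)
    Test$group <- rename_vec[as.character(Test$A)]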
    
          A  B   group
    1  WHEA  1 Cereal1
    2  RICE  2 Cereal2
    3  MAIZ  3 Cereal3
    4  BARL  4 Cereal4
    5  PMIL  5 Cereal5
    6  SMIL  6 Cereal6
    7  SORG  7 Cereal7
    8  OCER  8 Cereal8
    9  POTA  9  Roots1
    10 SWPO 10  Roots2
    11 YAMS 11  Roots3
    12 CASS 12  Roots4
    13 ORTS 13  Roots5
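    
    The group column above carries a running index (Cereal1, Cereal2, ...) because unlist() appends one to each name. If plain group names are wanted, as the question seems to ask, one possible variant is sketched below. It assumes the list is built with `=` so that its elements are named, and that A is a character column; the name group_of is only illustrative.

    ```r
    # Data from the question
    vector <- c("WHEA","RICE","MAIZ","BARL","PMIL","SMIL","SORG","OCER","POTA","SWPO","YAMS","CASS","ORTS","BEAN",
                "CHIC","COWP","PIGE","LENT","OPUL","SOYB","GROU","CNUT","BANA","PLNT","TROF","TEMF","VEGE","OILP",
                "SUNF","RAPE","SESA","OOIL","SUGC","SUGB","COTT","OFIB","ACOF","RCOF","COCO","TEAS","TOBA","REST")
    vector_list <- list(Cereal = vector[1:8], Roots = vector[9:13], Pulses = vector[14:19],
                        oilcr = vector[20:27], millet = vector[5:6], coffee = vector[32:33],
                        fruit = vector[39:40], banpl = vector[37:38])
    Test <- data.frame(A = vector, B = 1:42, stringsAsFactors = FALSE)

    # Repeat each group name once per member, so no index is appended
    group_of <- rep(names(vector_list), lengths(vector_list))
    # Name the labels by crop code; use.names = FALSE drops the "Cereal1"-style names
    names(group_of) <- unlist(vector_list, use.names = FALSE)

    # Look up each code by name; codes that belong to no group become NA
    Test$group <- unname(group_of[Test$A])
    head(Test, 3)
    ```

    Codes that appear in more than one group (PMIL and SMIL here) get the first matching group, and codes in no group (for example REST) get NA.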
    

    unlist() flattens the list into a single named vector. Each name is the group name with a running index appended:

    unlist(vector_list)
    
    Cereal1 Cereal2 Cereal3 Cereal4 Cereal5 Cereal6 Cereal7 Cereal8  Roots1  Roots2  Roots3  Roots4  Roots5 
     "WHEA"  "RICE"  "MAIZ"  "BARL"  "PMIL"  "SMIL"  "SORG"  "OCER"  "POTA"  "SWPO"  "YAMS"  "CASS"  "ORTS" 
    Pulses1 Pulses2 Pulses3 Pulses4 Pulses5 Pulses6  oilcr1  oilcr2  oilcr3  oilcr4  oilcr5  oilcr6  oilcr7 
     "BEAN"  "CHIC"  "COWP"  "PIGE"  "LENT"  "OPUL"  "SOYB"  "GROU"  "CNUT"  "BANA"  "PLNT"  "TROF"  "TEMF" 
     oilcr8 millet1 millet2 coffee1 coffee2  fruit1  fruit2  banpl1  banpl2 
     "VEGE"  "PMIL"  "SMIL"  "OOIL"  "SUGC"  "COCO"  "TEAS"  "ACOF"  "RCOF" 
    
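    Note that PMIL and SMIL occur twice in this output, once under Cereal and once under millet. A lookup by name returns only the first match, so the second membership is silently dropped. The following is a minimal sketch of one way to keep every membership, using stack() and split() on a small, made-up subset of the question's data; the names long and all_groups are illustrative only.

    ```r
    # Small subset of the question's list; PMIL and SMIL sit in two groups
    vector_list <- list(
      Cereal = c("WHEA", "RICE", "PMIL", "SMIL"),
      Roots  = c("POTA", "YAMS"),
      millet = c("PMIL", "SMIL")
    )
    Test <- data.frame(A = c("WHEA", "PMIL", "POTA", "REST"), B = 1:4,
                       stringsAsFactors = FALSE)

    # stack() turns the named list into a long table: values = code, ind = group
    long <- stack(vector_list)

    # Collect all groups of each code and collapse them into one label
    all_groups <- sapply(
      split(as.character(long$ind), as.character(long$values)),
      paste, collapse = ","
    )

    # Look up by code; codes in no group (here "REST") give NA
    Test$group <- unname(all_groups[Test$A])
    Test$group
    # "Cereal" "Cereal,millet" "Roots" NA
    ```

    Whether a combined label, the first match, or one row per membership is the right result depends on how the groups are meant to be used.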

    You then swap the names and the values of this vector, so that indexing it with Test$A returns the group for each value:

    rename_vec <- names(tmp)
    names(rename_vec) <- tmp
    rename_vec
    
    
         WHEA      RICE      MAIZ      BARL      PMIL      SMIL      SORG      OCER      POTA      SWPO      YAMS 
    "Cereal1" "Cereal2" "Cereal3" "Cereal4" "Cereal5" "Cereal6" "Cereal7" "Cereal8"  "Roots1"  "Roots2"  "Roots3" 
         CASS      ORTS      BEAN      CHIC      COWP      PIGE      LENT      OPUL      SOYB      GROU      CNUT 
     "Roots4"  "Roots5" "Pulses1" "Pulses2" "Pulses3" "Pulses4" "Pulses5" "Pulses6"  "oilcr1"  "oilcr2"  "oilcr3" 
         BANA      PLNT      TROF      TEMF      VEGE      PMIL      SMIL      OOIL      SUGC      COCO      TEAS 
     "oilcr4"  "oilcr5"  "oilcr6"  "oilcr7"  "oilcr8" "millet1" "millet2" "coffee1" "coffee2"  "fruit1"  "fruit2" 
         ACOF      RCOF 
     "banpl1"  "banpl2"