Search code examples
rmergematchingcalculated-columnsrefer

How to refer from a column to names of other columns and create a new column


I have a table such as

+---------+---------+--------+--------+--------+
| Product | Classif | Type 1 | Type 2 | Type 3 |
+---------+---------+--------+--------+--------+
| a       | Type 1  |      2 |      6 |      8 |
| b       | Type 2  |      3 |      9 |     11 |
| c       | Type 3  |      5 |     10 |     15 |
+---------+---------+--------+--------+--------+

Where I have a list of products and the classification they have. The matching between a product and a classification is enough to determine their price (which is in columns 3 to 5). I would like a new column that shows the price of each product according to its type, such as:

+---------+---------+--------+--------+--------+-------+
| Product | Classif | Type 1 | Type 2 | Type 3 | Price |
+---------+---------+--------+--------+--------+-------+
| a       | Type 1  |      2 |      6 |      8 |     2 |
| b       | Type 2  |      3 |      9 |     11 |     9 |
| c       | Type 3  |      5 |     10 |     15 |    15 |
+---------+---------+--------+--------+--------+-------+

Where the program compares the value of the column classif, and takes the value from the corresponding column.


Solution

  • What you look for can be reached reshaping your data first to long and then compute the comparison to obtain the price in order to join all together with left_join(). Here the code using tidyverse functions:

    library(tidyverse)
    #Code
    df2 <- df %>% left_join(df %>% pivot_longer(-c(Product,Classif)) %>%
      mutate(Price=ifelse(Classif==name,value,NA)) %>%
      filter(!is.na(Price)) %>% select(-c(name,value)))
    

    Output:

      Product Classif Type 1 Type 2 Type 3 Price
    1       a  Type 1      2      6      8     2
    2       b  Type 2      3      9     11     9
    3       c  Type 3      5     10     15    15
    

    Some data used:

    #Data
    df <- structure(list(Product = c("a", "b", "c"), Classif = c("Type 1", 
    "Type 2", "Type 3"), `Type 1` = c(2, 3, 5), `Type 2` = c(6, 9, 
    10), `Type 3` = c(8, 11, 15)), row.names = c(NA, -3L), class = "data.frame")