r vector indices

R - if column value matches vector item, take value from second vector


I have the following table:

library( tidyverse )
data = read.table(text="gene1
           gene2
           gene3", sep="\t", col.names = c("Protein"))

And the following two vectors:

genes = c("gene1", "gene3")
genes_names = c("name1", "name3")

Each item in genes_names corresponds to the item in genes at the same index.

Now I want to add a new column to data called ToLabel that holds the corresponding item from genes_names when the value in data$Protein appears in genes, and "no" otherwise.

data %>% mutate( ToLabel = ifelse( Protein %in% genes, genes_names, "no" ) )

This does not work as expected. My expected outcome:

Protein ToLabel
gene1   name1
gene2   no
gene3   name3

Solution

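  • Use a join when several values need to be replaced by matching on a key. Here trimws() strips the leading whitespace that read.table() kept from the indented input lines, and coalesce() fills the unmatched rows with "no".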

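    library(dplyr)
    data %>%
       # strip the leading spaces kept from the indented input
       mutate(Protein = trimws(Protein)) %>% 
       # look up each Protein in a small gene -> name table
       left_join(tibble(Protein = genes, ToLabel = genes_names), by = "Protein") %>%
       # rows without a match get NA from the join; replace with "no"
       mutate(ToLabel = coalesce(ToLabel, "no"))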
    

    Output:

      Protein ToLabel
    1   gene1   name1
    2   gene2      no
    3   gene3   name3
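
For comparison, here is a minimal base R sketch of the same lookup using match(). It also shows why the original ifelse() attempt misbehaves: ifelse() recycles genes_names to the number of rows and picks values by row position, not by gene. The sketch assumes the Protein values are already free of leading whitespace, so the data frame is built directly instead of through read.table(). The names attempt and idx are illustrative only.

```r
# Same values as in the question, built directly (assumption: no stray whitespace)
data <- data.frame(Protein = c("gene1", "gene2", "gene3"),
                   stringsAsFactors = FALSE)
genes <- c("gene1", "gene3")
genes_names <- c("name1", "name3")

# Original attempt: genes_names (length 2) is recycled to length 3 and
# indexed by row position, so the third row receives "name1", not "name3"
attempt <- ifelse(data$Protein %in% genes, genes_names, "no")

# match() returns the position of each Protein in genes (NA when absent)
idx <- match(data$Protein, genes)

# Use that position to pick the name; rows without a match get "no"
data$ToLabel <- ifelse(is.na(idx), "no", genes_names[idx])

print(attempt)
print(data)
```

With only one key column and one lookup vector, match() avoids building a second table. The join scales better once several columns have to be brought over at the same time.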