Tags: r, dplyr, tidyverse

Dynamic variables from dataframe value in R with value names?


Given a dataframe of types and values like so:

topic keyword
cheese cheddar
meat beef
meat chicken
cheese swiss
bread focaccia
bread sourdough
cheese gouda

My aim is to build a set of regexes dynamically, one per topic, but I don't know how to generate the variable names from the topics. I can do this for one topic at a time like so:

fn_get_topic_regex <- function(targettopic,df)
{
  filter_df <- df |>
    filter(topic == targettopic)
  regex <- paste(filter_df$keyword, collapse =  "|")
}

and do things like:

cheese_regex <- fn_get_topic_regex("cheese",df)

But what I'd like to be able to do is build all these regexes automatically without having to define each one.

The intended output would be something like:

cheese_regex: "cheddar|swiss|gouda"
bread_regex: "focaccia|sourdough"
meat_regex: "beef|chicken"

Where the start of the variable name is the distinct topic.

What's the best way to do that without defining each regex individually by hand?


Solution

  • You can use dplyr's group_by() and summarise() to build one regex per topic in a single table:

    df %>%
      group_by(topic) %>%
      summarise(regex = paste(keyword, collapse = "|"))
    
    # A tibble: 3 × 2
      topic  regex              
      <chr>  <chr>              
    1 bread  focaccia|sourdough 
    2 cheese cheddar|swiss|gouda
    3 meat   beef|chicken 
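
    If you want to look the patterns up by name rather than keep them in a table, the summarised result can be turned into a named character vector. The following is a minimal sketch, assuming the tibble package is available for deframe(); the data frame is rebuilt from the question, and the name regexes is only illustrative:

    ```r
    library(dplyr)
    library(tibble)

    # Example data copied from the question
    df <- tibble(
      topic   = c("cheese", "meat", "meat", "cheese", "bread", "bread", "cheese"),
      keyword = c("cheddar", "beef", "chicken", "swiss", "focaccia", "sourdough", "gouda")
    )

    regexes <- df %>%
      group_by(topic) %>%
      summarise(regex = paste(keyword, collapse = "|")) %>%
      mutate(topic = paste0(topic, "_regex")) %>%  # append the "_regex" suffix to each name
      deframe()                                    # first column becomes names, second becomes values

    regexes[["meat_regex"]]
    ```

    Each pattern is then available as regexes[["cheese_regex"]], regexes[["bread_regex"]], and so on.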
    

    Or you can apply your function to every unique value in df$topic:

    map_chr(unique(df$topic) %>% setNames(paste0(., "_regex")),
            fn_get_topic_regex, df = df)
    
             cheese_regex            meat_regex           bread_regex 
    "cheddar|swiss|gouda"        "beef|chicken"  "focaccia|sourdough"
    

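    Note that your function ends with an assignment, so it returns its value invisibly. It still works when you assign the result, but it is clearer to add return(regex) at the end, or not to assign the last line to a variable at all. I would even put everything in a single pipe chain: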

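    # Requires dplyr for filter() and pull()
    fn_get_topic_regex <- function(targettopic, df) {
      df |>
        filter(topic == targettopic) |>  # keep only rows for the requested topic
        pull(keyword) |>                 # extract the keywords as a character vector
        paste(collapse = "|")            # join them into one alternation pattern
    }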
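
If you really do need standalone variables such as cheese_regex, one possible approach in base R is to build a named list and copy it into an environment with list2env(). This is only a sketch under the assumption that the data looks like the example in the question; the name regexes is illustrative, and keeping the patterns in the named list is usually easier to work with than separate variables:

```r
# Example data copied from the question
df <- data.frame(
  topic   = c("cheese", "meat", "meat", "cheese", "bread", "bread", "cheese"),
  keyword = c("cheddar", "beef", "chicken", "swiss", "focaccia", "sourdough", "gouda")
)

# Split the keywords by topic, then collapse each group into one pattern
regexes <- lapply(split(df$keyword, df$topic), paste, collapse = "|")
names(regexes) <- paste0(names(regexes), "_regex")

# Look a pattern up by name; no separate variables needed
grepl(regexes[["cheese_regex"]], c("aged gouda", "rye bread"))

# Optional: create cheese_regex, meat_regex and bread_regex in the global environment
invisible(list2env(regexes, envir = globalenv()))
cheese_regex
```

The list form also makes it easy to loop over all topics, for example with lapply(regexes, grepl, x = some_text), where some_text stands for whatever character vector you want to search.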