Apply function over data frame rows


I'm trying to apply a function over the rows of a data frame and return a value for each row based on the values in that row's columns. I'd prefer to pass the whole data frame rather than name each variable, because the actual code has many variables; this is a simplified example.

I've tried purrr map_dbl and rowwise but can't get either to work. Any suggestions please?

library(dplyr)
library(purrr)

#sample df
df <- data.frame(Y=c("A","B","B","A","B"),
                  X=c(1,5,8,23,31))

#required result
Res <- data.frame(Y=c("A","B","B","A","B"),
                  X=c(1,5,8,23,31),
                  NewVal=c(10,500,800,230,3100)
                  )

#use mutate and map or rowwise etc
Res <- df %>%
  mutate(NewVal=map_dbl(.x=.,.f=FnAdd(.)))

Res <- df %>%
  rowwise() %>% 
  mutate(NewVal=FnAdd(.))


#sample fn
FnAdd <- function(Data){

  if(Data$Y=="A"){
    X=Data$X*10
  }  

  if(Data$Y=="B"){
    X=Data$X*100
  } 
  return(X)
}

Solution

  • If there are multiple values, it is better to build a key/value dataset, join it to the data, and then do the multiplication:

    library(dplyr)

    # lookup table: one multiplier per level of Y
    keyVal <- data.frame(Y = c("A", "B"), NewVal = c(10, 100))
    df %>%
       left_join(keyVal, by = "Y") %>%
       mutate(NewVal = X * NewVal)
    #  Y  X NewVal
    #1 A  1     10
    #2 B  5    500
    #3 B  8    800
    #4 A 23    230
    #5 B 31   3100
    

    It is not clear how many unique values there are in the 'Y' column of the actual dataset. If there are only a few, then case_when can be used:

    FnAdd <- function(Data){
       Data %>%
          mutate(NewVal = case_when(Y == "A" ~ X * 10,
                                    Y == "B" ~ X * 100,
                                    TRUE ~ X))  # any other Y leaves X unchanged
    }
    
    FnAdd(df)
    #  Y  X NewVal
    #1 A  1     10
    #2 B  5    500
    #3 B  8    800
    #4 A 23    230
    #5 B 31   3100
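
If a genuinely row-by-row call is still wanted, as in the question, one possible sketch uses purrr::pmap_dbl, which iterates over the rows of a data frame and passes each row's columns as named arguments. The attempts in the question most likely fail because map_dbl iterates over columns rather than rows, because `.f=FnAdd(.)` calls the function immediately instead of passing it, and because `.` still refers to the whole data frame even after rowwise(), so `if(Data$Y=="A")` receives a length-5 condition (an error in R 4.2.0 and later). FnAddRow below is a hypothetical row-level variant of FnAdd, not part of the original code, and it assumes that any other value of Y should leave X unchanged.

```r
library(dplyr)
library(purrr)

df <- data.frame(Y = c("A", "B", "B", "A", "B"),
                 X = c(1, 5, 8, 23, 31))

# Hypothetical row-level variant of FnAdd: `Row` is a named list holding
# the values of a single row, so each `if` sees a length-1 condition.
FnAddRow <- function(Row) {
  if (Row$Y == "A") {
    Row$X * 10
  } else if (Row$Y == "B") {
    Row$X * 100
  } else {
    Row$X  # assumption: other values of Y leave X unchanged
  }
}

# pmap_dbl walks the data frame row by row; list(...) collects every
# column of the current row, so no variable has to be named explicitly.
Res <- mutate(df, NewVal = pmap_dbl(df, function(...) FnAddRow(list(...))))
Res$NewVal
# 10 500 800 230 3100
```

This keeps the "pass the whole row" interface, but it calls the function once per row, so on large data the join or case_when versions above are likely to be faster.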