
Problem with mutate keyword and functions in R


I have a problem using mutate; please see the code block below.

output1 <- mytibble %>% 
  mutate(newfield = FND(mytibble$ndoc)) 
output1

Here FND is a function that filters a large (5 GB) tibble:

FND <- function(n){
  result <- LARGETIBBLE %>% filter(LARGETIBBLE$id == n)
  return(paste(unique(result$somefield),collapse=" "))
}

I want FND to run once for each row of the tibble, but it only runs once.


Solution

  • Writing FND(mytibble$ndoc) is base-R data frame style. Inside dplyr verbs such as mutate there is no need to give the name of the tibble, only that of the column: the pipe %>% passes the tibble to mutate as its first argument, and mutate looks the column name up in that tibble. Your example would then be:

    
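    library(dplyr)

    # Define FND before it is used in the pipeline.
    FND <- function(n){
      # id is looked up as a column of LARGETIBBLE
      result <- LARGETIBBLE %>% filter(id == n)
      return(paste(unique(result$somefield),collapse=" "))
    }

    # ndoc is looked up as a column of mytibble
    output1 <- mytibble %>%
      mutate(newfield = FND(ndoc))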
    
    

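    That is the idea in theory, but I do not know whether your FND function will work as written. Note that mutate calls FND once with the whole ndoc column, not once per row, so FND either has to handle a vector of ids or be applied to each element separately. Try it, and if it still fails, post a small reproducible example with data and a description of what you are trying to achieve.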
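
Since mutate passes a whole column to FND in one call, a per-row result needs either an explicit loop over the ids or a lookup table. The following is only a sketch under assumptions: the small LARGETIBBLE and mytibble are made-up stand-ins for the data in the question, and the names lookup and output2 are hypothetical. Option 1 keeps FND as it is and calls it once per element with vapply; Option 2 collapses somefield per id once and joins the result, which may be the more practical route for a 5 GB table because the large tibble is scanned once rather than once per row.

```r
library(dplyr)

# Hypothetical stand-ins for the data in the question
LARGETIBBLE <- tibble(
  id        = c(1, 1, 2, 3, 3),
  somefield = c("a", "b", "c", "d", "d")
)
mytibble <- tibble(ndoc = c(1, 2, 3, 4))

# Same lookup as in the question, written for a single id
FND <- function(n) {
  result <- LARGETIBBLE %>% filter(id == n)
  paste(unique(result$somefield), collapse = " ")
}

# Option 1: call FND once per element of ndoc
output1 <- mytibble %>%
  mutate(newfield = vapply(ndoc, FND, character(1)))

# Option 2: collapse somefield once per id, then join on the id
lookup <- LARGETIBBLE %>%
  group_by(id) %>%
  summarise(newfield = paste(unique(somefield), collapse = " "))

output2 <- mytibble %>%
  left_join(lookup, by = c("ndoc" = "id"))
```

The two options differ for ids that have no match in LARGETIBBLE: Option 1 yields an empty string, while the left join in Option 2 yields NA.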