r tidyverse purrr mapr

Map function over nested columns


I was wondering if I could get some help with the code below. I would like to create a new list column holding the results of my peaksizefunction. I think the error may come from misusing the map function?

library(pracma)  # for the findpeaks() function
library(tidyverse)

example<-structure(list(Year = c(1, 2, 3), data = list(lv2 = structure(list(
  Day = 1:10, Monkeys = c(1000, 1084.79382617268, 1128.7320290261, 1192.5408323771, 1262.24997394777, 1381.3734051781, 1172.35788735297,  861.283169243595, 635.851700647185, 499.355928090663)), class = "data.frame", row.names = c(NA, -10L)), lv2 = structure(list(Day = 1:10, Monkeys = c(1000, 1124.15225852655, 1230.01336979967, 1341.80600037997, 1429.92959384224, 1470.88040331676, 1643.24202598493, 1812.51355499621, 1946.67392606854, 2134.80778581273)), class = "data.frame", row.names = c(NA, -10L)), lv2 = structure(list(Day = 1:10, Monkeys= c(1000, 1116.20746967752, 1218.98113158501, 1110.34866544069, 1212.18568925552, 1023.59978403131, 736.569202268108, 480.000410980056, 504.039087330224, 529.611440434258)), class = "data.frame", row.names = c(NA, -10L)))), class = c("rowwise_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, -3L), groups = structure(list(Year = c(1, 2, 3), data = list(lv2 = structure(list(Day = 1:10, Monkeys = c(1000,1084.79382617268, 1128.7320290261, 1192.5408323771, 1262.24997394777, 1381.3734051781, 1172.35788735297, 861.283169243595, 635.851700647185, 499.355928090663)), class = "data.frame", row.names = c(NA, -10L)), lv2 = structure(list(Day = 1:10, Monkeys = c(1000, 1124.15225852655, 1230.01336979967, 1341.80600037997, 1429.92959384224, 1470.88040331676,1643.24202598493, 1812.51355499621, 1946.67392606854, 2134.80778581273)), class = "data.frame", row.names = c(NA, -10L)), lv2 = structure(list(Day = 1:10, Monkeys = c(1000, 1116.20746967752, 1218.98113158501, 1110.34866544069, 1212.18568925552, 1023.59978403131, 736.569202268108, 480.000410980056, 504.039087330224, 529.611440434258)), class = "data.frame", row.names = c(NA, -10L))), .rows = structure(list(1L, 2L, 3L), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", "list"))), row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame")))


peaksizefunction <- function(x) {
  # findpeaks() returns NULL when it finds no peak; use a single row of zeros instead
  peaksize <- findpeaks(x)
  if (is.null(peaksize)) peaksize <- matrix(c(0, 0, 0, 0), nrow = 1)
  return(peaksize)
}
##my attempt
testdf2<- example %>%
  mutate(Peaks = map(.x= example$data,.f= ~peaksizefunction(x= .x$Monkeys)))

Error in `mutate()`:
! Problem while computing `Peaks = map(.x = example$data, .f = ~peaksizefunction(x = .x$Monkeys))`.
x `Peaks` must be size 1, not 3.
i Did you mean: `Peaks = list(map(.x = example$data, .f = ~peaksizefunction(x = .x$Monkeys)))` ?
i The error occurred in row 1.
Run `rlang::last_error()` to see where the error occurred.

Thanks in advance, Stuart


Solution

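  • example is a rowwise data frame (note the "rowwise_df" class in the dput() output), so mutate() evaluates the expression one row at a time and expects a result of size 1, while map() over example$data returns all 3 elements on every row. Regroup with group_by(Year), which replaces the rowwise grouping, and map over the data column itself rather than example$data: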

    example %>%
      group_by(Year) %>% 
      mutate(Peaks= map(data, ~peaksizefunction(.x$Monkeys)))
    

    Output:

       Year data          Peaks        
      <dbl> <named list>  <named list> 
    1     1 <df [10 × 2]> <dbl [1 × 4]>
    2     2 <df [10 × 2]> <dbl [1 × 4]>
    3     3 <df [10 × 2]> <dbl [2 × 4]>
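
    If you would rather not regroup, here is a minimal sketch of two other routes that should give the same list column. It runs on a small made-up tibble (toy and its values are hypothetical, not the OP's data) and assumes, as the OP's function does, that findpeaks() returns NULL when it finds no peak.

    ```r
    library(dplyr)
    library(purrr)
    library(pracma)  # provides findpeaks()

    # Hypothetical toy data: one small data frame per year, stored rowwise
    toy <- tibble(
      Year = c(1, 2),
      data = list(
        data.frame(Day = 1:5, Monkeys = c(1, 3, 5, 2, 1)),  # one peak at Day 3
        data.frame(Day = 1:5, Monkeys = c(1, 2, 3, 4, 5))   # no peak
      )
    ) %>%
      rowwise()

    peaksizefunction <- function(x) {
      peaks <- findpeaks(x)
      # Fall back to a single row of zeros when no peak is found
      if (is.null(peaks)) matrix(0, nrow = 1, ncol = 4) else peaks
    }

    # Route 1: stay rowwise. Inside mutate(), `data` is one data frame,
    # so no map() is needed; wrap the result in list() to get a list column.
    by_row <- toy %>%
      mutate(Peaks = list(peaksizefunction(data$Monkeys))) %>%
      ungroup()

    # Route 2: drop the rowwise grouping first, then map() over the whole column.
    by_map <- toy %>%
      ungroup() %>%
      mutate(Peaks = map(data, ~ peaksizefunction(.x$Monkeys)))
    ```

    Route 1 is what the "Did you mean: list(...)" hint in the error message is pointing towards; Route 2 is closest to the original attempt.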
    

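    Update, based on the OP's query in the comments:

    If you would like the function to return a named list, rather than a matrix, you could update the function like this. Note that findpeaks() returns its columns in the order peak height, peak position, start index, end index, so with the names used here Start holds the peak position, End the start of the peak and Time its end; rename them if that is not what you intend.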

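    peaksizefunction <- function(x) {
      # findpeaks() returns NULL when it finds no peak; use a single row of zeros instead
      peaksize <- findpeaks(x)
      if (is.null(peaksize)) peaksize <- matrix(c(0, 0, 0, 0), nrow = 1)
      # Split the matrix into its four columns and name each one
      return(setNames(asplit(peaksize, 2), c("Size", "Start", "End", "Time")))
    }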
    

    Then apply it as before, but follow it with unnest_wider() to spread the named list into columns and unnest() to give each peak its own row. You get the following output:

    example %>%
      group_by(Year) %>% 
      mutate(Peaks= map(data, ~peaksizefunction(.x$Monkeys))) %>% 
      unnest_wider(Peaks) %>% 
      unnest(cols = Size:Time)
    
       Year data           Size Start   End  Time
      <dbl> <named list>  <dbl> <dbl> <dbl> <dbl>
    1     1 <df [10 x 2]> 1381.     6     1    10
    2     2 <df [10 x 2]>    0      0     0     0
    3     3 <df [10 x 2]> 1219.     3     1     4
    4     3 <df [10 x 2]> 1212.     5     4     8
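
    A possible variation, sketched here on made-up data, is to have the helper return a tibble so that a single unnest() is enough. The helper name peaks_as_tibble, the toy data and the column names Height, PeakIndex, StartIndex and EndIndex are all hypothetical; the names are chosen to follow the column order documented for findpeaks().

    ```r
    library(dplyr)
    library(purrr)
    library(tidyr)
    library(pracma)  # provides findpeaks()

    # Hypothetical helper: one row per peak, or one row of zeros if none is found
    peaks_as_tibble <- function(x) {
      peaks <- findpeaks(x)
      if (is.null(peaks)) peaks <- matrix(0, nrow = 1, ncol = 4)
      # Assumed column order: height, peak position, start index, end index
      colnames(peaks) <- c("Height", "PeakIndex", "StartIndex", "EndIndex")
      as_tibble(peaks)
    }

    # Hypothetical toy data, not the OP's
    toy <- tibble(
      Year = c(1, 2, 3),
      data = list(
        data.frame(Day = 1:5, Monkeys = c(1, 3, 5, 2, 1)),  # one peak
        data.frame(Day = 1:5, Monkeys = c(1, 2, 3, 4, 5)),  # no peak
        data.frame(Day = 1:5, Monkeys = c(1, 4, 2, 6, 3))   # two peaks
      )
    )

    result <- toy %>%
      mutate(Peaks = map(data, ~ peaks_as_tibble(.x$Monkeys))) %>%
      unnest(Peaks)  # one row per peak; Year and data are repeated as needed
    ```

    Compared with the named-list version, this keeps each peak's four values together in one row from the start, at the cost of storing a small tibble per year before unnesting.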