Why does purrr::map not work with two list columns in one tibble?


I am trying to count how many days in column dias_trabajo are in dias_evaluar.

library(tidyverse)
library(lubridate)

This is the minimal reprex:

tibble(
  dias_trabajo = list(seq(ymd("2021-01-01"), ymd("2021-01-22"), by = "day"), seq(ymd("2021-01-04"), ymd("2021-01-22"), by = "day")),
  dias_evaluar = list(seq(ymd("2021-01-01"), ymd("2021-01-07"), by = "day"))
) %>% 
  mutate(
    trabajo = map(dias_trabajo, function(x) x %in% dias_evaluar) %>% map_int(sum)
  )

The code above returns zeros in the trabajo column:

# A tibble: 2 x 3
  dias_trabajo dias_evaluar trabajo
  <list>       <list>         <int>
1 <date [22]>  <date [7]>         0
2 <date [19]>  <date [7]>         0

I expect the column trabajo to be: first row: 7, second row: 4.

I tried the same comparison on a single pair of date vectors, outside the tibble, and it works:

seq(ymd("2021-01-01"), ymd("2021-01-22"), by = "day") %in% seq(ymd("2021-01-01"), ymd("2021-01-07"), by = "day") %>% sum()

This gives the expected result for the first row:

[1] 7

Solution

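  • Inside mutate(), dias_evaluar refers to the whole list column, so x %in% dias_evaluar compares a Date vector against a list and nothing matches. Since you have two list columns as input, dias_trabajo and dias_evaluar, you need map2 to iterate over them in parallel: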

    library(tidyverse)
    library(lubridate)
    
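    # tb is defined in the "data:" block below
    tb %>% 
      mutate(
        # sum() of a logical vector is an integer, so map2_int gives an <int> column
        trabajo = map2_int(.x = dias_trabajo, .y = dias_evaluar, ~sum(.x %in% .y))
        )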
    
    # A tibble: 2 x 3
      dias_trabajo dias_evaluar trabajo
      <list>       <list>         <int>
    1 <date [22]>  <date [7]>         7
    2 <date [19]>  <date [7]>         4
    

    data:

    tb <- tibble(
      dias_trabajo = list(
        seq(ymd("2021-01-01"), ymd("2021-01-22"), by = "day"),
        seq(ymd("2021-01-04"), ymd("2021-01-22"), by = "day")
      ),
      dias_evaluar = list(seq(ymd("2021-01-01"), ymd("2021-01-07"), by = "day"))
    )
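
To see why the original call returns zeros while the pairwise version does not, here is a minimal sketch in base R. It rests on a few assumptions made only to keep it dependency-free: as.Date() stands in for ymd(), vapply() and mapply() stand in for map_int() and map2_int(), and the d_* names are illustrative, not part of the question.

```r
# Base R only; as.Date() stands in for lubridate::ymd().
d_trabajo <- list(
  seq(as.Date("2021-01-01"), as.Date("2021-01-22"), by = "day"),
  seq(as.Date("2021-01-04"), as.Date("2021-01-22"), by = "day")
)
d_week <- seq(as.Date("2021-01-01"), as.Date("2021-01-07"), by = "day")

# Inside mutate(), dias_evaluar is the whole list column; the length-1 list
# is recycled to two rows, so it looks like this:
d_evaluar <- list(d_week, d_week)

# What the original map() call computes: each Date vector is matched
# against the list itself, not against the dates stored inside it.
wrong <- vapply(d_trabajo, function(x) sum(x %in% d_evaluar), integer(1))

# Pairwise version: element i of one list against element i of the other,
# which is what map2_int() does in the answer.
right <- mapply(function(x, y) sum(x %in% y), d_trabajo, d_evaluar)

print(wrong)  # zeros, as in the question
print(right)  # 7 4
```

The only difference between the two calls is what ends up on the right-hand side of %in%: the list as a whole, or the Date vector for the current row.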