Use purrr to calculate sunrise time for each row in R


I am new to purrr and struggling to understand how to append the result of my function onto my dataframe (and get the best performance, since my dataframe is large).

I'm attempting to calculate sunrise time for each row in a dataframe:

library(tidyverse)
library(StreamMetabolism)

test <- structure(list(Latitude = c(44.49845, 42.95268, 42.95268, 44.49845,
44.49845, 44.49845), Longitude = c(-78.19259, -81.36935, -81.36935, -78.19259,
-78.19259, -78.19259), date = c("2014/02/12", "2014/01/24", "2014/01/08",
"2014/01/11", "2014/01/10", "2014/01/07"), timezone = c("EST5EDT", "EST5EDT",
"EST5EDT", "EST5EDT", "EST5EDT", "EST5EDT")), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -6L))

sunRise <- function(Latitude, Longitude, date, timezone){
  print(sunrise.set(Latitude, Longitude, date, timezone, num.days = 1)[1,1])
}

I got this far, which gets me the desired sunrise times:

test %>% 
  pwalk(sunRise)

[1] "2014-02-12 07:17:09 EST"
[1] "2014-01-24 07:47:55 EST"
[1] "2014-01-08 07:56:13 EST"
[1] "2014-01-11 07:47:38 EST"
[1] "2014-01-10 07:47:59 EST"
[1] "2014-01-07 07:48:48 EST"

But I can't figure out how to append the results of my function to the "test" dataframe as a new variable, say "sunrise_time". This is what I tried:

test %>% 
  mutate(sunrisetime = pwalk(sunRise))

Error in mutate_impl(.data, dots) : Evaluation error: argument ".f" is missing, with no default.

Sidebar: if you can recommend a good purrr tutorial that worked for you, please include it in your answer!! There seems to be a lot to know about purrr and I'm not sure what to focus on as a first-timer.


Solution

  • You don't really need purrr here. Here's a dplyr approach:

    library(dplyr)
    library(StreamMetabolism)
    
    # updated function: returns the value instead of printing it
    sunRise <- function(Latitude, Longitude, date, timezone){
      sunrise.set(Latitude, Longitude, date, timezone, num.days = 1)[1,1]
    }
    
    test %>%
      rowwise() %>%
      mutate(sunrise_time = sunRise(Latitude, Longitude, date, timezone)) %>%
      ungroup()
    
    # # A tibble: 6 x 5
    #   Latitude Longitude date       timezone sunrise_time
    #      <dbl>     <dbl> <chr>      <chr>    <dttm>             
    # 1     44.5     -78.2 2014/02/12 EST5EDT  2014-02-12 07:17:09
    # 2     43.0     -81.4 2014/01/24 EST5EDT  2014-01-24 07:47:55
    # 3     43.0     -81.4 2014/01/08 EST5EDT  2014-01-08 07:56:13
    # 4     44.5     -78.2 2014/01/11 EST5EDT  2014-01-11 07:47:38
    # 5     44.5     -78.2 2014/01/10 EST5EDT  2014-01-10 07:47:59
    # 6     44.5     -78.2 2014/01/07 EST5EDT  2014-01-07 07:48:48
    

    Or, if you want to use purrr, you can do:

    library(tidyverse)
    
    test %>%
      group_by(id = row_number()) %>%
      nest() %>%
      mutate(sunrise_time = map(data, ~sunRise(.x$Latitude, .x$Longitude, .x$date, .x$timezone))) %>%
      unnest()
    
    # # A tibble: 6 x 6
    #      id sunrise_time        Latitude Longitude date       timezone
    #   <int> <dttm>                 <dbl>     <dbl> <chr>      <chr>   
    # 1     1 2014-02-12 07:17:09     44.5     -78.2 2014/02/12 EST5EDT 
    # 2     2 2014-01-24 07:47:55     43.0     -81.4 2014/01/24 EST5EDT 
    # 3     3 2014-01-08 07:56:13     43.0     -81.4 2014/01/08 EST5EDT 
    # 4     4 2014-01-11 07:47:38     44.5     -78.2 2014/01/11 EST5EDT 
    # 5     5 2014-01-10 07:47:59     44.5     -78.2 2014/01/10 EST5EDT 
    # 6     6 2014-01-07 07:48:48     44.5     -78.2 2014/01/07 EST5EDT 
    

    You can drop the id column afterwards if you want, e.g. by adding ungroup() %>% select(-id) to the end of the pipeline.

    Or, you can slightly change your function and do this:

    # updated function: returns a named list so pmap_df() can build a sunrise_time column
    sunRise <- function(Latitude, Longitude, date, timezone){
      return(list(sunrise_time = sunrise.set(Latitude, Longitude, date, timezone, num.days = 1)[1,1]))
    }
    
    # apply function to each row and create a dataframe
    # bind columns with original dataset
    pmap_df(test, sunRise) %>%
      cbind(test, .)
    
    #   Latitude Longitude       date timezone        sunrise_time
    # 1 44.49845 -78.19259 2014/02/12  EST5EDT 2014-02-12 07:17:09
    # 2 42.95268 -81.36935 2014/01/24  EST5EDT 2014-01-24 07:47:55
    # 3 42.95268 -81.36935 2014/01/08  EST5EDT 2014-01-08 07:56:13
    # 4 44.49845 -78.19259 2014/01/11  EST5EDT 2014-01-11 07:47:38
    # 5 44.49845 -78.19259 2014/01/10  EST5EDT 2014-01-10 07:47:59
    # 6 44.49845 -78.19259 2014/01/07  EST5EDT 2014-01-07 07:48:48
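
For reference, the mutate(sunrisetime = pwalk(sunRise)) attempt in the question fails because sunRise is passed as the list argument .l, so .f is never supplied. In addition, pwalk() is meant for side effects and returns its input rather than the function's results. Below is a minimal sketch of calling a pmap variant directly inside mutate(). It assumes a hypothetical stand-in, sun_rise_stub(), which returns a string so the example runs without StreamMetabolism. The demo tibble is a two-row copy of the question's data.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for sunRise(): returns one string per call so the
# sketch does not depend on StreamMetabolism being installed.
sun_rise_stub <- function(Latitude, Longitude, date, timezone) {
  paste(date, timezone, sprintf("lat=%.2f lon=%.2f", Latitude, Longitude))
}

# Two rows copied from the question's data.
demo <- tibble(
  Latitude  = c(44.49845, 42.95268),
  Longitude = c(-78.19259, -81.36935),
  date      = c("2014/02/12", "2014/01/24"),
  timezone  = c("EST5EDT", "EST5EDT")
)

# pmap_chr() steps through the four columns in parallel and returns a
# character vector with one element per row. The list is unnamed, so the
# columns are matched to the function's arguments by position.
result <- demo %>%
  mutate(sunrise_time = pmap_chr(
    list(Latitude, Longitude, date, timezone),
    sun_rise_stub
  ))
```

With the real sunRise(), which returns a date-time rather than a string, pmap() would produce a list column instead, so a flattening step would still be needed. Treat the string version above as an illustration of the calling pattern only.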