I'm trying to subset data that falls between sunset and sunrise. The data:
library(tidyverse)
library(lubridate)
library(suncalc)
dat <- tibble(datetime = seq(as.POSIXct('2020-08-03 00:00:00'),
as.POSIXct('2020-08-09 12:00:00'),
by=3600),
var1 = rnorm(157,2,1),
var2 = rnorm(157,3,5)) %>%
mutate(getSunlightTimes(date = as.Date(datetime, format = '%m/%d/%Y'),
lat = 43.1, lon = -76.2, tz = 'America/New_York',
keep = c('sunrise', 'sunset'))) %>%
select(c(datetime, var1, var2, sunrise, sunset))
Then I would like to subset the data so that I only retain rows where datetime
falls between sunrise and sunset on a given day. I tried:
myrange <- as.interval(unique(dat$sunrise), unique(dat$sunset))
dat <- dat %>%
filter(datetime %within% myrange)
This compiles but throws warnings and does not include all the rows of data that it should. Thanks in advance.
Try this:
First, create a partial dat
. I'll add date
, since we need it both for the calculations and for joining the data back in.
set.seed(42)
dat <- tibble(datetime = seq(as.POSIXct('2020-08-03 00:00:00'),
as.POSIXct('2020-08-09 12:00:00'),
by=3600),
var1 = rnorm(157,2,1),
var2 = rnorm(157,3,5)) %>%
mutate(date = as.Date(datetime))
dat
# # A tibble: 157 x 4
# datetime var1 var2 date
# <dttm> <dbl> <dbl> <date>
# 1 2020-08-03 00:00:00 3.37 -1.00 2020-08-03
# 2 2020-08-03 01:00:00 1.44 0.333 2020-08-03
# 3 2020-08-03 02:00:00 2.36 9.44 2020-08-03
# 4 2020-08-03 03:00:00 2.63 2.12 2020-08-03
# 5 2020-08-03 04:00:00 2.40 -2.36 2020-08-03
# 6 2020-08-03 05:00:00 1.89 3.82 2020-08-03
# 7 2020-08-03 06:00:00 3.51 1.19 2020-08-03
# 8 2020-08-03 07:00:00 1.91 5.95 2020-08-03
# 9 2020-08-03 08:00:00 4.02 10.2 2020-08-03
# 10 2020-08-03 09:00:00 1.94 -1.96 2020-08-03
# # ... with 147 more rows
Now reduce that to the distinct date
s found, get the sunrise/sunset for those dates, and join back in to the original dat
(with left_join
). After that, we can filter
the times we need.
distinct(dat, date) %>%
with(., getSunlightTimes(date = date,
lat = 43.1, lon = -76.2, tz = 'America/New_York',
keep = c('sunrise', 'sunset'))) %>%
left_join(dat, ., by = "date") %>%
filter(sunrise <= datetime, datetime <= sunset)
# # A tibble: 85 x 8
# datetime var1 var2 date lat lon sunrise sunset
# <dttm> <dbl> <dbl> <date> <dbl> <dbl> <dttm> <dttm>
# 1 2020-08-03 06:00:00 3.51 1.19 2020-08-03 43.1 -76.2 2020-08-03 05:59:13 2020-08-03 20:25:11
# 2 2020-08-03 07:00:00 1.91 5.95 2020-08-03 43.1 -76.2 2020-08-03 05:59:13 2020-08-03 20:25:11
# 3 2020-08-03 08:00:00 4.02 10.2 2020-08-03 43.1 -76.2 2020-08-03 05:59:13 2020-08-03 20:25:11
# 4 2020-08-03 09:00:00 1.94 -1.96 2020-08-03 43.1 -76.2 2020-08-03 05:59:13 2020-08-03 20:25:11
# 5 2020-08-03 10:00:00 3.30 5.27 2020-08-03 43.1 -76.2 2020-08-03 05:59:13 2020-08-03 20:25:11
# 6 2020-08-03 11:00:00 4.29 3.42 2020-08-03 43.1 -76.2 2020-08-03 05:59:13 2020-08-03 20:25:11
# 7 2020-08-03 12:00:00 0.611 7.48 2020-08-03 43.1 -76.2 2020-08-03 05:59:13 2020-08-03 20:25:11
# 8 2020-08-03 13:00:00 1.72 1.85 2020-08-03 43.1 -76.2 2020-08-03 05:59:13 2020-08-03 20:25:11
# 9 2020-08-03 14:00:00 1.87 7.18 2020-08-03 43.1 -76.2 2020-08-03 05:59:13 2020-08-03 20:25:11
# 10 2020-08-03 15:00:00 2.64 -5.73 2020-08-03 43.1 -76.2 2020-08-03 05:59:13 2020-08-03 20:25:11
# # ... with 75 more rows
Note that we cannot use dplyr::between
here, since that function only uses the first of its left
and right
arguments.