In my dataframe CORtrial I have two columns, rDate in POSixct
format which goes from 2015-07-27 17:45:00
until 2017-08-31 16:55:00
and REFN630 in numeric
format. A value of REFN630 is recorded in rDate in a time interval of 5 in 5 minutes.
That's the structure of my dataframe:
dput(head(CORtrial,10))
structure(list(rDate = structure(c(1438019100, 1438019400, 1438019700,
1438020000, 1438020300, 1438020600, 1438020900, 1438021200, 1438021500,
1438021800), class = c("POSIXct", "POSIXt"), tzone = ""), REFN630 = c(0.0111940298507463,
0.0671936758893281, 0.0143198090692124, 0.0087719298245614, 0.00936768149882904,
0.00985221674876847, 0.00775193798449612, 0.00815217391304348,
0.00859598853868195, 0.00911854103343465)), row.names = c(NA,
-10L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x0000024f693c1ef0>)
I want to know the max
value between a specific date and time interval, for instance, I want to know the MAXIMUM value of REFN630 between the rDate 2015-10-23 11:25:00
and 2015-10-23 11:40:00
. How can I do this?
2) After getting the maximum value throughout:
df %>%
filter(ymd_hms(rDate) %within%
interval(ymd_hms("2015-07-27 22:25:00"), ymd_hms("2015-07-27 22:50:00"))) %>%
slice_max(order_by = REFN630)
rDate REFN630
1: 2015-07-27 22:25:00 0.01431981
More columns were added to the dataframe and the structure is the following:
dput(head(COR_trial,10))
structure(list(rDate = structure(c(1438015500, 1438015800, 1438016100,
1438016400, 1438016700, 1438017000, 1438017300, 1438017600, 1438017900,
1438018200), class = c("POSIXct", "POSIXt"), tzone = ""), REFN532 = c(0.0127971,
0.1348315, 0.0215983, 0.0143443, 0.0150862, 0.0158014, 0.0167866,
0.0152284, 0.0162162, 0.0172911), REFN570 = c(0.0172414, 0.1515748,
0.0171306, 0.0149573, 0.0157303, 0.0142518, 0.0150376, 0.016,
0.0142045, 0.0151976), REFN630 = c(0.011194, 0.0671937, 0.0143198,
0.0087719, 0.0093677, 0.0098522, 0.0077519, 0.0081522, 0.008596,
0.0091185), REFN800 = c(0.0169082, 0.1030928, 0.0560472, 0.0569801,
0.0574018, 0.0573248, 0.0531561, 0.0520833, 0.0510949, 0.0498084
)), row.names = c(NA, 10L), class = "data.frame")
Now when I use the code to get the maximum of REFN630, this is what I get:
COR_trial %>%
filter(ymd_hms(rDate) %within%
interval(ymd_hms("2015-07-27 16:00:00"), ymd_hms("2015-07-27 18:00:00"))) %>%
slice_max(order_by = REFN630)
rDate REFN532 REFN570 REFN630 REFN800
1 2015-07-27 17:50:00 0.1348315 0.1515748 0.0671937 0.1030928
The desired output would be:
rDate REFN630
1 2015-07-27 17:50:00 0.0671937
How can I do this? Thanks in advance.
I think you can use the following solution. I needed to change the interval dates as the ones you mentioned on your question were not present in the data set and would lead to an empty data set after filtering. Here we first transform our preferred dates to Date
class within interval
function from lubridate
package. Then we filter our data set for only those rDate
s that fall within this interval.
library(dplyr)
library(lubridate)
df %>%
filter(ymd_hms(rDate) %within%
interval(ymd_hms("2015-07-27 22:25:00"), ymd_hms("2015-07-27 22:50:00"))) %>%
summarise(Maximum = max(REFN630, na.rm = TRUE))
Maximum
1 0.01431981
In order to have only the name of the column that contains max value:
library(tidyr)
df %>%
filter(ymd_hms(rDate) %within%
interval(ymd_hms("2015-07-27 21:15:00"), ymd_hms("2015-07-27 21:30:00"))) %>%
select(rDate, REFN630) %>%
slice_max(order_by = REFN630)
rDate REFN630
1 2015-07-27 21:20:00 0.0671937