I would like to keep all cases in a list that reach an individual cut-off value. The list consists of 650 individual cross-correlations from the ccf function.
str(d_posts_ccf_10_10)
List of 650
$ Europa_Teles_BTR :List of 6
..$ acf : num [1:61, 1, 1] 0.01628 -0.00581 -0.04069 -0.16275 0.35689 ...
..$ type : chr "correlation"
..$ n.used: int 148
..$ lag : num [1:61, 1, 1] -30 -29 -28 -27 -26 -25 -24 -23 -22 -21 ...
..$ series: chr "X"
..$ snames: chr ".$Posts_game & .$Posts_stop"
..- attr(*, "class")= chr "acf"
$ Polonpo :List of 6
..$ acf : num [1:61, 1, 1] -0.05826 0.13355 -0.06989 -0.00596 -0.05827 ...
..$ type : chr "correlation"
..$ n.used: int 127
..$ lag : num [1:61, 1, 1] -30 -29 -28 -27 -26 -25 -24 -23 -22 -21 ...
..$ series: chr "X"
..$ snames: chr ".$Posts_game & .$Posts_stop"
..- attr(*, "class")= chr "acf"
$ derchefz :List of 6
..$ acf : num [1:61, 1, 1] 0 0.0587 -0.0744 0.2663 -0.268 ...
..$ type : chr "correlation"
..$ n.used: int 143
..$ lag : num [1:61, 1, 1] -30 -29 -28 -27 -26 -25 -24 -23 -22 -21 ...
..$ series: chr "X"
..$ snames: chr ".$Posts_game & .$Posts_stop"
..- attr(*, "class")= chr "acf"
Every case has its own number of used observations. I am interested in the ACF values, and I would like to keep all cases where at least one ACF value exceeds "±2/√T, where T is the length of the time series" (I guess that is n.used). The reason for this procedure is that I would like to get all significant lags without visually inspecting the ACF plots, since there are about 650 cases. I would really appreciate some help or advice on this one! My attempts so far:
library(purrr)
test_499 <- d_posts_ccf_10_10
%>% keep(.x$acf < 2/sqrt(x$n.used))
%>% keep(.x$acf > -2/sqrt(x$n.used))
test_500 <- d_posts_ccf_10_10 %>% map(~ .x$acf) %>%
keep(function(x) x > 2/sqrt(.x$n.used))
There may be more elegant solutions, but one base R approach is to first establish your list-specific cutoff values, then combine lapply and sapply to create a logical vector indicating whether any cutoff falls within the range of each element's acf values.
It's difficult to ensure this will work with your exact data without reproducible code, but suppose your data look like this (where the second element does not meet the criteria and should be removed, but the first and third should be kept):
have_list <- list(list1 = list(n.used = 123,
                               acf = c(-1, 0.2, 0.3, 12),
                               ignore = LETTERS),
                  list2 = list(n.used = 321,
                               acf = seq(10, 20, 0.1),
                               ignore = letters),
                  list3 = list(n.used = 111,
                               acf = seq(-1, 1, 0.01),
                               ignore = 1:26))
You can try the approach described above like this:
# create cutoffs
cutoffs <- unlist(lapply(have_list, function(x) 2 / sqrt(x[["n.used"]])))
# list1 list2 list3
# 0.1803339 0.1116291 0.1898316
# create keep index
keep_index <- unlist(lapply(have_list, function(x) {
  any(sapply(seq_along(have_list), function(y) {
    cutoffs[y] >= min(x[["acf"]]) & cutoffs[y] <= max(x[["acf"]])
    # Or for dplyr
    # dplyr::between(cutoffs[y], min(x[["acf"]]), max(x[["acf"]]))
  }))
}))
# list1 list2 list3
# TRUE FALSE TRUE
new_list <- have_list[keep_index]
# keeps only list1 and list3
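If the goal is instead read literally as "keep every element where at least one ACF value lies outside ±2/√n.used", a stricter variant would compare each element only against its own cutoff. The following is a minimal base R sketch under that assumption; has_significant_lag and toy_list are hypothetical names made up for illustration, and the toy data only mimics the n.used, acf and lag components of ccf() output.

```r
# Hypothetical helper: TRUE if any |acf| value exceeds the element's own
# cutoff of 2 / sqrt(n.used)
has_significant_lag <- function(x) {
  cutoff <- 2 / sqrt(x[["n.used"]])
  any(abs(x[["acf"]]) > cutoff, na.rm = TRUE)
}

# Toy data standing in for the real list of ccf() results (assumption)
toy_list <- list(
  a = list(n.used = 100, acf = c(0.05, -0.30, 0.10), lag = c(-1, 0, 1)),
  b = list(n.used = 100, acf = c(0.05, -0.10, 0.10), lag = c(-1, 0, 1)),
  c = list(n.used = 400, acf = c(0.15,  0.02, -0.01), lag = c(-1, 0, 1))
)

# Keep only the elements with at least one value beyond their own cutoff
kept <- Filter(has_significant_lag, toy_list)

# For the kept elements, pull out the lags at which the cutoff is exceeded
significant_lags <- lapply(kept, function(x) {
  x[["lag"]][abs(x[["acf"]]) > 2 / sqrt(x[["n.used"]])]
})
```

On the real data this would presumably be called as Filter(has_significant_lag, d_posts_ccf_10_10), or equivalently purrr::keep(d_posts_ccf_10_10, has_significant_lag). Note that under this reading, list2 in the example above would also be kept, because all of its values exceed its own cutoff.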