In R, I used Rosner's test (EnvStats::rosnerTest) to identify outliers in my dataset. My end goal is a vector of the row numbers in my dataset where Outlier = TRUE.
var.ros.test <- rosnerTest(df$var, k = 20)
class(var.ros.test)
[1] "gofOutlier"
print(var.ros.test$all.stats)
This is the output of var.ros.test$all.stats. I highlighted the Obs.Num values that I want to save in a vector, i.e. those where Outlier = TRUE.
I started with the code below, but I am stuck because it returns every Obs.Num, when I only want the Obs.Num values where Outlier = TRUE.
var.out <- subset(var.ros.test$all.stats, select = "Obs.Num")
print(var.out)
Obs.Num
1 977
2 91
3 384
4 97
5 281
.
.
.
> class(var.out)
[1] "data.frame"
> var.out <- var.out[,1]
> var.out <- as.vector(var.out) %>% unlist(var.out)
> print(var.out)
[1] 977 91 384 97 281 65 512 331 6 1041 39
[12] 2 147 69 856 133 329 577 1017 104
This would be fine, except that it also contains the Obs.Num values where Outlier = FALSE.
Previously, I tried doing this:
var.out <- subset(var.ros.test$all.stats, select = "Obs.Num") %>%
filter(var.ros.test$all.stats, Outlier == "TRUE")
But I get this error: "Error in filter(): ℹ In argument: var.ros.test$all.stats. Caused by error: ! ..1$i must be a logical vector, not a double vector."
I would greatly appreciate any tips on how to get a vector of the Obs.Num values where Outlier = TRUE. Thank you so much!
Not sure if this is the only problem, but you're filtering on the character string "TRUE". I assume Outlier is a logical vector.
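As a quick check of that point (a sketch only, using a made-up vector named outlier): in R, comparing a logical vector against the string "TRUE" coerces the logical values to character, so the comparison still returns the expected result. That makes the string redundant rather than an error on its own. The message in the question may therefore come from elsewhere, plausibly from passing var.ros.test$all.stats to filter a second time after the pipe has already supplied the data.

```r
# Made-up logical vector, standing in for the Outlier column
outlier <- c(TRUE, FALSE, TRUE)

# Comparing with the string "TRUE" coerces the logicals to character first,
# so the result matches the original logical vector
same <- identical(outlier == "TRUE", outlier)
print(same)
```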
If this is your data
library(dplyr)  # provides %>% and filter()
set.seed(42)
df <- data.frame(dat = 1:10,
                 sam = sample(10),
                 Outlier = sample(c(TRUE, FALSE), 10, replace = TRUE))
df
dat sam Outlier
1 1 1 TRUE
2 2 5 FALSE
3 3 10 TRUE
4 4 8 FALSE
5 5 2 TRUE
6 6 4 TRUE
7 7 6 FALSE
8 8 9 FALSE
9 9 7 FALSE
10 10 3 FALSE
with Outlier's class being
class(df$Outlier)
[1] "logical"
then filter only needs the variable itself, since it's already logical.
df %>% filter(Outlier)
dat sam Outlier
1 1 1 TRUE
2 3 10 TRUE
3 5 2 TRUE
4 6 4 TRUE
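To get from there to the vector the question asks for, the same idea can be applied to the Obs.Num column. The sketch below uses base R and a hypothetical stand-in for var.ros.test$all.stats. The name all_stats and the Outlier flags are invented for illustration, and only the first five Obs.Num values from the question are reused.

```r
# Hypothetical stand-in for var.ros.test$all.stats.
# The Outlier flags are made up; the real ones come from rosnerTest().
all_stats <- data.frame(
  i       = 0:4,
  Obs.Num = c(977, 91, 384, 97, 281),
  Outlier = c(TRUE, TRUE, TRUE, FALSE, FALSE)
)

# Index Obs.Num with the logical Outlier column to keep only flagged rows
outlier_rows <- all_stats$Obs.Num[all_stats$Outlier]
print(outlier_rows)
```

With dplyr loaded, all_stats %>% filter(Outlier) %>% pull(Obs.Num) should give the same vector. On the real object, replacing all_stats with var.ros.test$all.stats ought to work the same way, assuming Outlier is stored there as a logical column.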