I am trying to write code that removes outliers from a linear model, and I want to stay flexible about which formula I use for this. But it does not work.
require(caret)
random_samples <- createDataPartition(iris$Sepal.Length, times=10, p=0.8)
getTrainTest <- function(Index, data){
train_data <- data[Index, ] # training data = the rows selected by Index (partition k)
test_data <- data[-Index, ] # test data = original data frame minus the training rows
return(list("train"=train_data, "test"=test_data))
}
datasets <- lapply(random_samples, getTrainTest, iris)
forumla1 <- as.formula(Sepal.Length ~ Petal.Length)
compute_cooks_models <- function(x,eq){
cooks.distance(lm(eq,
data = x, na.action = na.exclude))}
result <- Map (compute_cooks_models,datasets, eq=forumla1)
Error: object of type 'symbol' is not subsettable
I don't understand what I am doing wrong.
Could someone help me out? Nadine
You have a couple of issues in your code.
The first is that datasets is a list of lists of data frames, so when you loop through it with Map you are iterating over the first level and passing a list (with the elements train and test) to compute_cooks_models, not a data frame. If you'd like to fit the lm model on the training set, you have to use x$train in the data argument.
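To see the nesting for yourself, you can inspect one element of the list. This is a minimal sketch: datasets_demo and idx are hypothetical names, and sample() stands in for createDataPartition so that the snippet runs with base R only.

```r
# Hypothetical stand-in for one partition; sample() replaces
# caret::createDataPartition so that only base R is needed.
set.seed(1)
idx <- sample(nrow(iris), 120)

# Same shape as the list built by getTrainTest: one list per resample,
# each holding a "train" and a "test" data frame.
datasets_demo <- list(
  Resample01 = list(train = iris[idx, ], test = iris[-idx, ])
)

first <- datasets_demo[[1]]
class(first)        # "list", not "data.frame"
names(first)        # "train" "test"
class(first$train)  # "data.frame"
```

So whatever function you map over the outer list receives that inner list and has to pick train or test itself.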
The second issue is with the use of Map. This function assumes that you're passing a vector or a list of values for each argument of the function, and it calls the function on the first elements of all arguments, then on the second elements, and so on. An example:
my_fun <- function(x, y){
paste0(x, y)
}
Map(my_fun, letters[1:5], 1:5)
## Output:
# $a
# [1] "a1"
#
# $b
# [1] "b2"
#
# $c
# [1] "c3"
#
# $d
# [1] "d4"
#
# $e
# [1] "e5"
In your case this means that the function receives the first element of datasets together with the first element of forumla1. A formula is not a list of formulas, so Map takes it apart and passes a single symbol from it to the lm call, which of course causes the error.
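If you would rather keep Map, one option is to hand the formula over as a constant through MoreArgs, which Map forwards to mapply. The following is a sketch under assumptions: demo_sets and cooks_for are hypothetical names, and the partitions are drawn with sample() instead of createDataPartition so the code is self-contained.

```r
f <- Sepal.Length ~ Petal.Length

# A formula behaves like a length-3 object: `~`, left side, right side.
length(f)          # 3
is.symbol(f[[1]])  # TRUE: this is the kind of element Map hands to lm

# Hypothetical partitions with the same train/test layout as in the question.
set.seed(42)
demo_sets <- lapply(1:3, function(i) {
  idx <- sample(nrow(iris), 120)
  list(train = iris[idx, ], test = iris[-idx, ])
})

cooks_for <- function(x, eq) {
  cooks.distance(lm(eq, data = x$train, na.action = na.exclude))
}

# MoreArgs is passed unchanged to every call instead of being iterated over.
res <- Map(cooks_for, demo_sets, MoreArgs = list(eq = f))
length(res)  # 3, one vector of Cook's distances per partition
```

Wrapping the formula as eq = list(f) should work as well, because a length-one list is recycled across all partitions.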
You could instead use sapply, which passes eq unchanged to every call and should do what you need, I think, like so:
forumla1 <- as.formula(Sepal.Length ~ Petal.Length)
compute_cooks_models <- function(x,eq){
cooks.distance(lm(eq, data = x$train, na.action = na.exclude))
}
result <- sapply(datasets, compute_cooks_models, eq=forumla1)
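Since the goal is to remove outliers, here is a rough sketch of how the distances could be used for one training set. The cutoff of 4/n is only a common rule of thumb and an assumption on my part, not something from your code; train, threshold and train_clean are hypothetical names.

```r
# Hypothetical single training set; sample() stands in for createDataPartition.
set.seed(123)
idx <- sample(nrow(iris), 120)
train <- iris[idx, ]

eq <- Sepal.Length ~ Petal.Length
fit <- lm(eq, data = train, na.action = na.exclude)
cd <- cooks.distance(fit)

# Assumed rule of thumb: flag observations whose distance exceeds 4 / n.
threshold <- 4 / nrow(train)
keep <- cd <= threshold

# Drop the flagged rows and refit the same formula on the remaining data.
train_clean <- train[keep, ]
refit <- lm(eq, data = train_clean)
nrow(train) - nrow(train_clean)  # number of rows removed
```

Because the formula is stored in eq, you can swap in a different one without touching the rest of the code.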