I need to fit many loess splines by the grouping variable (Animal) across multiple numeric columns (Var1, Var2), and extract these values.
I found code that does this for one variable at a time:
# Create dataframe 1
OneVarDF <- data.frame(Day = c(replicate(1,sample(1:50,200,rep=TRUE))),
Animal = c(c(replicate(100,"Greyhound"), c(replicate(100,"Horse")))),
Var1 = c(c(replicate(1,sample(2:10,100,rep=TRUE))), c(replicate(1,sample(15:20,100,rep=TRUE)))))
library(dplyr)
library(tidyr)
library(purrr)
# Get fitted values from each model
Models <- OneVarDF %>%
tidyr::nest(data = -Animal) %>%
dplyr::mutate(m = purrr::map(data, loess, formula = Var1 ~ Day, span = 0.30),
fitted = purrr::map(m, `[[`, "fitted")
)
# Create prediction column
Results <- Models %>%
dplyr::select(-m) %>%
tidyr::unnest(cols = c(data, fitted))
This "Results" dataframe is essential for downstream tasks (detrending many non-parametric distributions).
How can we achieve this with a dataframe with multiple numeric columns (code below), and extract a "Results" dataframe? Thank you.
# Create dataframe 2
TwoVarDF <- data.frame(Day = c(replicate(1,sample(1:50,200,rep=TRUE))),
Animal = c(c(replicate(100,"Greyhound"), c(replicate(100,"Horse")))),
Var1 = c(c(replicate(1,sample(2:10,100,rep=TRUE))), c(replicate(1,sample(15:20,100,rep=TRUE)))),
Var2 = c(c(replicate(1,sample(22:27,100,rep=TRUE))), c(replicate(1,sample(29:35,100,rep=TRUE)))))
We can reshape the data to long format with pivot_longer, group_by Animal and the column name, and apply loess to each combination.
library(dplyr)
library(tidyr)
TwoVarDF %>%
pivot_longer(cols = starts_with('Var')) %>%
group_by(Animal, name) %>%
mutate(fitted = loess(value ~ Day, span = 0.3)$fitted)  # fitted values for each Animal/variable group
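If the downstream steps need one row per observation again, like the "Results" dataframe in the question, one possible approach is to tag each row before reshaping and widen the result afterwards. This is only a sketch: the row_id helper column, the set.seed call, and the simplified construction of TwoVarDF are assumptions added for illustration, not part of the original code.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: seed added only so the sketch is reproducible

# Same shape as the question's TwoVarDF, built with sample()/rep() directly
TwoVarDF <- data.frame(
  Day    = sample(1:50, 200, replace = TRUE),
  Animal = rep(c("Greyhound", "Horse"), each = 100),
  Var1   = c(sample(2:10, 100, replace = TRUE), sample(15:20, 100, replace = TRUE)),
  Var2   = c(sample(22:27, 100, replace = TRUE), sample(29:35, 100, replace = TRUE))
)

Results <- TwoVarDF %>%
  # hypothetical helper column: keeps rows distinct, since Day values repeat
  mutate(row_id = row_number()) %>%
  pivot_longer(cols = starts_with("Var")) %>%
  group_by(Animal, name) %>%
  # one loess fit per Animal/variable combination
  mutate(fitted = loess(value ~ Day, span = 0.3)$fitted) %>%
  ungroup() %>%
  # back to one row per observation, with value_* and fitted_* columns
  pivot_wider(names_from = name, values_from = c(value, fitted)) %>%
  select(-row_id)
```

Results should then have the original 200 rows with columns Day, Animal, value_Var1, value_Var2, fitted_Var1 and fitted_Var2, so a detrended series could be computed as, for example, value_Var1 - fitted_Var1.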