I have a group of 10 subjects (using a subset of 3 of them below as a sample dataset). I have managed to create a for loop that runs through the 3 subjects and fits each one to a sigmoidal curve using nlsLM (package minpack.lm). nlsLM is the same as nls, it just uses a Levenberg-Marquardt algorithm instead of the Gauss-Newton algorithm used with the base nls function.
I have also manage to extract the model parameters (plateau, S50, slope) and put them in a dataframe. My code is as follows:
dfRC <- read_csv("data/sampleRCdf.csv") #read in the raw data files
# list of subjects of interest to loop through
sub <- unique(dfRC$subID)
# define start to obtain the number and name of fit parameters
start <- list(plateau=1, S50=1, slope=1)
# create empty data.frame to store IDs and parameters
params.pre <- data.frame(matrix(nrow = length(sub), ncol = 1+length(start)))
names(params.pre) <- c("sub", names(start))
# nested for loop that goes by subject (i)
for(i in seq_along(sub)) {
# create data frame for sub "i"
individual_DFs <- dfRC %>% filter (subID %in% sub[i])
# fit model for each sub "i"
fitpre.loop <- nlsLM(mepAMP_pre ~ plateau / (1 + exp(slope*(S50 - state))),
data = individual_DFs,
start = start, trace = TRUE)
# store IDs
params.pre[i,1] <- sub[i]
# store fit parameters
params.pre[i,2:ncol(params)] <- fitpre.loop$m$getPars()
}
params.pre
params.pre
gives:
sub plateau S50 slope
101 3.579751 6.505194 0.6930363
202 2.506159 3.538753 0.8300668
303 1.971020 5.888228 0.4806047
This is for the following dataset (where 101, 202, and 303 are my sample subjects), where 101.y.pre is mepAMP_pre
and x is state
in the for loop above:
x = c(1,2,3,4,5,6,7,8,9,10,11)
101.y.pre <- c(0.38117575, 0.11300747, 0.37239499, 0.51321839, 0.56851893,
1.73259296, 2.08146847, 2.80090151, 3.04446933, 2.67647473, 3.87695509)
202.y.pre <- c(0.263931535, 0.554056564, 0.903243066, 1.758670072, 1.512232414,
2.382228869, 2.744255537, 1.943985522, 2.642561877, 2.880719751, 2.139018852)
303.y.pre <- c(0.197647216, 0.095434883, 0.523944806, 0.625025631, 0.92489588,
0.898288637, 0.918388724, 1.433502882, 2.127665395, 1.649622992, 1.642610593)
I had a few questions.
I did a sanity check where I ran each subject individually through nlsLM and made sure the output parameters from that matched the output from my for loop. They do! So, it works. However, I'm a little confused about the output of the individual_DFs
dataframe. I thought it would show all 3 subjects, but when I open it up it just shows subject 303. Clearly the code works, but I'm confused why only 1 subject is showing up in the dataframe?
You'll notice I named the param output as params.pre
and use the mepAMP_pre
values for the model - this is because I have both pre- and post-treatment data for each subject. The .csv file has the column headings "subID", "state", "mepAMP_pre", and "mepAMP_post". Is there a way to use a for loop to loop over both the pre and post values and spit out plateau_pre, plateau_post, S50_pre, S50_post, slope_pre, slope_post
and pull them all out into a single dataframe like I currently have with params.pre
with the different subjects as rows?
Any advice on how I could graph it? I have managed to graph the pre- fit for one subject using the following code:
df101 <- data.frame(x, fit101.y)
p.sample <- ggplot(data = df101, aes(x = x, y = fit101.y)) +
geom_point() +
geom_smooth(method = "nls",
data = df101,
formula = fit101.y ~ plateau / (1 + exp(slope*(S50 - x))), start = list( plateau=1, S50=1, slope=1),
se = FALSE)
p.sample
I'd like to maybe overlap the curves for each subject pre- and post-per subject in a grid-like pattern of figures. Or, perhaps graph all the pre- curves onto one figure and all post- curves onto another figure. Would appreciate any help with doing either of these!
---------------EDIT---------------
Thank you Allan for your help! Here is my full data frame (output from dput(dfRC)
)
structure(list(subID = c(101, 101, 101, 101, 101, 101, 101, 101,
101, 101, 101, 202, 202, 202, 202, 202, 202, 202, 202, 202, 202,
202, 303, 303, 303, 303, 303, 303, 303, 303, 303, 303, 303),
state = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11),
mepAMP_pre = c(0.38117575, 0.11300747, 0.37239499, 0.51321839,
0.56851893, 1.73259296, 2.08146847, 2.80090151, 3.04446933,
2.67647473, 3.87695509, 0.263931535, 0.554056564, 0.903243066,
1.758670072, 1.512232414, 2.382228869, 2.744255537, 1.943985522,
2.642561877, 2.880719751, 2.139018852, 0.197647216, 0.095434883,
0.523944806, 0.625025631, 0.92489588, 0.898288637, 0.918388724,
1.433502882, 2.127665395, 1.649622992, 1.642610593), mepAMP_post = c(0.126321776,
0.566816552, 0.374254417, 0.199486984, 0.510302018, 1.03651474,
1.697137046, 2.090100867, 3.448320717, 2.095180146, 2.897606435,
0.018846444, 0.041664734, 0.51243325, 0.961881685, 0.998366952,
2.082848001, 2.713030559, 3.373811346, 2.839989549, 3.283945894,
3.052075374, 0.232427913, 0.895619231, 1.194016429, 1.721528554,
2.249776715, 2.756416541, 4.716890788, 4.16244235, 4.757734573,
4.965043759, 4.732616496)), row.names = c(NA, -33L), spec = structure(list(
cols = list(subID = structure(list(), class = c("collector_double",
"collector")), state = structure(list(), class = c("collector_double",
"collector")), mepAMP_pre = structure(list(), class = c("collector_double",
"collector")), mepAMP_post = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), delim = ","), class = "col_spec"), problems = <pointer: 0x10d8e15d0>, class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"))
I tried running the code you included and the updated params.pre worked great, however the plot came out weird and I'm not sure where I'm going wrong.
The code:
params.pre <- do.call("rbind", lapply(split(dfRC, dfRC$subID), function(d) {
mod <- nlsLM(mepAMP_pre ~ plateau / (1 + exp(slope*(S50 - state))),
data = d, start = list(plateau = 1, S50 = 1, slope = 1))
as.data.frame(c(list(sub = d$subID[1]), as.list(coef(mod))))
}))
ggplot(dfRC, aes(state, mepAMP_pre, color = subID)) +
geom_point() +
geom_smooth(method = nlsLM, se = FALSE,
formula = mepAMP_pre ~ plateau / (1 + exp(slope*(S50 - state))),
method.args = list(start = list(plateau = 1, S50 = 1, slope = 1)))
You don't need to chop up dfRC
to get the plot working. First pivot into long format:
library(minpack.lm)
library(tidyverse)
dfRC_long <- dfRC %>%
pivot_longer(contains('mep'), names_sep = '_',
names_to = c('.value', 'prepost')) %>%
mutate(subID = factor(subID), prepost = factor(prepost, c('pre', 'post')))
Now your plotting code is just:
ggplot(dfRC_long, aes(state, mepAMP, colour = subID)) +
geom_point() +
geom_smooth(method = nlsLM, se = FALSE,
formula = y ~ plateau / (1 + exp(slope*(S50 - x))),
method.args = list(start = list(plateau = 1, S50 = 1, slope = 1))
) +
facet_grid(.~prepost)
Similarly, you can skip an explicit loop to generate your table using some tidyverse functions:
dfRC_long %>%
group_by(subID, prepost) %>%
group_map(.f = ~ nlsLM(mepAMP ~ plateau / (1 + exp(slope*(S50 - state))),
data = .x,
start = list(plateau = 1, S50 = 1, slope = 1)) %>%
coef() %>%
t() %>%
as.data.frame() %>%
mutate(pre_or_post = .y$prepost, .before = 1) %>%
mutate(subID = .y$subID, .before = 2)) %>%
bind_rows() %>%
arrange(pre_or_post, subID)
#> pre_or_post subID plateau S50 slope
#> 1 pre 101 3.579751 6.505194 0.6930363
#> 2 pre 202 2.506159 3.538753 0.8300668
#> 3 pre 303 1.971020 5.888228 0.4806047
#> 4 post 101 2.874621 6.538601 0.9221484
#> 5 post 202 3.225695 5.356826 0.9406343
#> 6 post 303 5.084059 5.094672 0.6321269
Created on 2023-08-16 with reprex v2.0.2