I created a function so I can run several statistical procedures on multiple variables in a dataset: 1) a paired t-test, 2) a simple boxplot, and 3) summary statistics of patient measures before and after an intervention.
Here is my code so far:
stat_func <- function(df, var) {
  t.test(data = df, var ~ timepoint, paired = TRUE)
  boxplot(data = df,
          var ~ timepoint,
          col = c("pink", "#EFC000FF"),
          xlab = "",
          ylab = "Percent Time Between 70-140")
  df %>%
    dplyr::group_by(timepoint) %>%
    dplyr::summarise(min = round(min(var), 2), mean = round(mean(var), 2), max = round(max(var), 2), sd = round(sd(var), 2))
}
When I run this code, RStudio doesn't show the t-test output. It does create the boxplot, but the dplyr::summarise output is identical for both groups: it doesn't distinguish "before" from "after" in the timepoint variable.
Here is example data for one of my variables of interest:
have<-as.data.frame(structure(list(subjectid=structure(c(1, 1, 2, 2, 3, 3, 4,4)),
timepoint=structure(c("before", "after", "before", "after", "before", "after", "before", "after")),
estimated_a1c=structure(c(10, 7.5, 10.5, 9.4, 9.8, 7, 9.9, 7.3)))))
When I define my function and then call stat_func(have, have$estimated_a1c), I don't see any t-test output in the RStudio console. The boxplot is generated, but the summarise results look like this, and they shouldn't be identical:
  timepoint   min  mean   max    sd
  <chr>     <dbl> <dbl> <dbl> <dbl>
1 after         7  8.93  10.5  1.41
2 before        7  8.93  10.5  1.41
Any suggestions for how to get t-test & correct group-by/summarise output would be much appreciated.
Thank you!
Here's a shot at what I think you're getting at.
library(dplyr)
df <- as.data.frame(structure(
  list(
    subjectid = structure(c(1, 1, 2, 2, 3, 3, 4, 4)),
    timepoint = structure(
      c(
        "before",
        "after",
        "before",
        "after",
        "before",
        "after",
        "before",
        "after"
      )
    ),
    estimated_a1c = structure(c(10, 7.5, 10.5, 9.4, 9.8, 7, 9.9, 7.3))
  )
))
# Define function; var and time are column names passed as strings
stat_func <- function(df, var, time) {
  # draw the boxplot on the active graphics device (the Plots pane in RStudio)
  boxplot(
    formula = as.formula(paste(var, "~", time)),
    data = df,
    col = c("pink", "#EFC000FF"),
    xlab = "",
    ylab = "Percent Time Between 70-140"
  )
  # results list for t.test and summary df
  results <- list()
  # save t test to list
  # (pairing is by position, so rows must be in the same subject order
  # within "before" and "after")
  results$t.test <- t.test(
    df[[var]][df[[time]] == "before"],
    df[[var]][df[[time]] == "after"],
    paired = TRUE
  )
  # save summary df to list
  results$summary <- df %>%
    dplyr::group_by(!!sym(time)) %>%
    dplyr::summarise(
      min = round(min(!!sym(var)), 2),
      mean = round(mean(!!sym(var)), 2),
      max = round(max(!!sym(var)), 2),
      sd = round(sd(!!sym(var)), 2)
    )
  return(results)
}
stat_func(df = df, var = 'estimated_a1c', time = 'timepoint')
$t.test
Paired t-test
data: df[[var]][df[[time]] == "before"] and df[[var]][df[[time]] == "after"]
t = 5.7934, df = 3, p-value = 0.01023
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
1.014025 3.485975
sample estimates:
mean difference
2.25
$summary
# A tibble: 2 × 5
timepoint min mean max sd
<chr> <dbl> <dbl> <dbl> <dbl>
1 after 7 7.8 9.4 1.09
2 before 9.8 10.0 10.5 0.31
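The summaries now differ by group because the column is looked up by name inside the grouped data. In the original call, have$estimated_a1c is a plain vector of all eight values, so every group summarised the same numbers. The t-test never printed because a function only auto-prints the value of its last expression, and earlier results are discarded unless they are printed or returned. Below is a minimal sketch of the grouping difference. The names x, col_name, ungrouped and grouped are made up for illustration, and it uses dplyr's .data pronoun as one alternative to !!sym().

```r
library(dplyr)

have <- data.frame(
  subjectid = c(1, 1, 2, 2, 3, 3, 4, 4),
  timepoint = rep(c("before", "after"), times = 4),
  estimated_a1c = c(10, 7.5, 10.5, 9.4, 9.8, 7, 9.9, 7.3)
)

# A vector that lives outside the data frame is not split by group,
# so each group summarises all eight values.
x <- have$estimated_a1c
ungrouped <- have %>%
  group_by(timepoint) %>%
  summarise(mean = mean(x))

# Looking the column up by name inside the data respects the grouping.
col_name <- "estimated_a1c"
grouped <- have %>%
  group_by(timepoint) %>%
  summarise(mean = mean(.data[[col_name]]))

print(ungrouped)  # same mean in both rows
print(grouped)    # one mean per timepoint
```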
The boxplot should appear in the Plots pane, while a list containing the t-test result and the summary data frame should print to the console. If you want to save the boxplot in the list as well, I have better luck saving ggplot figures to lists than base boxplots, but I'm sure there's a way.
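For the ggplot route, a rough sketch might look like the following. It assumes the ggplot2 package is installed. make_boxplot is a hypothetical helper name, not part of the function above, and the colour mapping assumes the two timepoint values are exactly "after" and "before".

```r
library(ggplot2)

df <- data.frame(
  subjectid = c(1, 1, 2, 2, 3, 3, 4, 4),
  timepoint = rep(c("before", "after"), times = 4),
  estimated_a1c = c(10, 7.5, 10.5, 9.4, 9.8, 7, 9.9, 7.3)
)

# Hypothetical helper: builds the boxplot as a ggplot object instead of
# drawing it immediately, so it can be stored alongside other results.
make_boxplot <- function(df, var, time) {
  ggplot(df, aes(x = .data[[time]], y = .data[[var]], fill = .data[[time]])) +
    geom_boxplot(show.legend = FALSE) +
    scale_fill_manual(values = c(after = "pink", before = "#EFC000FF")) +
    labs(x = "", y = "Percent Time Between 70-140")
}

results <- list()
results$plot <- make_boxplot(df, "estimated_a1c", "timepoint")

# The figure is only drawn when the stored object is printed:
# print(results$plot)
```

Because a ggplot is an ordinary R object, it can sit in the same list as the t-test and summary and be printed or saved later.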
A couple of things to note: