r dplyr confidence-interval sample-size

How to obtain length of column using dplyr when coding for confidence interval in R


My data set is available here. In short, I have a column called fitted for which I need to plot the mean and confidence intervals.

I am trying to compute the confidence intervals with dplyr so that I can use them in my ggplot.

data.melt$time = factor(data.melt$time, levels=paste("t", seq(0, 10), sep=""))

Here is the code:

summary_dat = data.melt$time  %>%
              group_by(resource, fertilizer, time) %>%
              summarise(mean_predict=mean(fitted),
                        sd_predict = sd(fitted),
                        n_predict = n(fitted)) %>%

  mutate(se = sd_predict / sqrt(n_predict),
         lower_ci = mean_predict - qt(1 - (0.05 / 2), n_predict - 1) * se_predict,
         upper_ci = mean_predict + qt(1 - (0.05 / 2), n_predict - 1) * se_predict)

However, R does not let me define n_predict as n(fitted). I also tried length(fitted), but with no luck. Any ideas?


Solution

  • The convenience function n() in dplyr counts the number of rows in the current group, not the length of a particular column, and it takes no arguments. Use either n_predict = n() or n_predict = length(fitted).
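
For illustration, here is a minimal sketch of the whole summary with n() in place. The toy data.melt built at the top is a hypothetical stand-in for the real data set (which is not reproduced here), and .groups = "drop" assumes dplyr 1.0.0 or later. Two other details in the question's code would also stop it from running, which may be why length(fitted) appeared not to work either: the pipe starts from the column data.melt$time rather than the data frame data.melt, and the standard error is created as se but later referred to as se_predict.

```r
library(dplyr)

# Hypothetical stand-in for data.melt:
# 2 resources x 2 fertilizers x 3 time points, 5 rows per combination
set.seed(1)
data.melt <- expand.grid(
  resource   = c("low", "high"),
  fertilizer = c("none", "added"),
  time       = paste0("t", 0:2),
  rep        = 1:5
)
data.melt$fitted <- rnorm(nrow(data.melt))

summary_dat <- data.melt %>%                 # pipe the data frame, not one column
  group_by(resource, fertilizer, time) %>%
  summarise(
    mean_predict = mean(fitted),
    sd_predict   = sd(fitted),
    n_predict    = n(),                      # rows in the current group; no arguments
    .groups      = "drop"
  ) %>%
  mutate(
    se_predict = sd_predict / sqrt(n_predict),
    t_crit     = qt(1 - 0.05 / 2, df = n_predict - 1),  # two-sided 95% t quantile
    lower_ci   = mean_predict - t_crit * se_predict,
    upper_ci   = mean_predict + t_crit * se_predict
  )

print(summary_dat)
```

Because fitted has no missing values in this sketch, length(fitted) would give the same counts as n(). If the real column contains NA values, sum(!is.na(fitted)) together with na.rm = TRUE in mean() and sd() may be the safer choice.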