I have four columns that contain numerical values (hundreds of rows). I would like to plot a bar chart that shows the average value of each of those columns on one chart: four bars, each representing one column.
The columns are veryactive, fairlyactive, lightlyactive, and sedentary. I already know the mean of each column from the summary function, but I want to plot the means on a chart. Do I need another variable for the other axis?
I was able to plot one of the columns in a bar chart with calories on the x axis, but I would just like to compare the means of the four columns within one bar chart.
ggplot(Activity_Zero, aes(x = calories, y = veryactive)) +
  stat_summary(geom = 'bar', fun = 'mean')
Here is a sample of my data: tibble of my data
It depends on how your data are formatted. I've provided two examples below.
If you're starting with a table of the summary values, you can do this:
library(ggplot2)
levels <- c("veryactive", "fairlyactive", "lightlyactive", "sedentary")
df1 <- data.frame(activitylevel = factor(levels, levels = levels),
                  meancalories = c(3000, 2500, 2000, 1500))
ggplot(df1, aes(x = activitylevel, y = meancalories)) +
  geom_col()
Created on 2023-01-30 by the reprex package (v2.0.1)
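If the means are not in a table yet, one way to build that table is base R's colMeans(). This is only a minimal sketch: the `wide` data frame and its values are made up to stand in for the asker's data, and `df_means` and `meanvalue` are illustrative names.

```r
library(ggplot2)

# Made-up stand-in for the real data: one numeric column per activity level
set.seed(1)
wide <- data.frame(
  veryactive    = rnorm(20, 20, 5),
  fairlyactive  = rnorm(20, 15, 5),
  lightlyactive = rnorm(20, 200, 30),
  sedentary     = rnorm(20, 1000, 100)
)

# colMeans() returns a named numeric vector with one mean per column
means <- colMeans(wide)

# Turn the named vector into a two-column summary table;
# the factor levels keep the bars in the original column order
df_means <- data.frame(
  activitylevel = factor(names(means), levels = names(means)),
  meanvalue = unname(means)
)

p <- ggplot(df_means, aes(x = activitylevel, y = meanvalue)) +
  geom_col()
p
```

If the real data frame has other columns (such as calories), select the four activity columns before calling colMeans(), and pass na.rm = TRUE if there are missing values.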
And if you're starting with your original data in long form, you can do this:
library(ggplot2)
levels <- c("veryactive", "fairlyactive", "lightlyactive", "sedentary")
df2 <- data.frame(activitylevel = factor(rep(levels, each = 20), levels = levels),
                  calories = c(rnorm(20, 3000, 100),
                               rnorm(20, 2500, 100),
                               rnorm(20, 2000, 100),
                               rnorm(20, 1500, 100)))
ggplot(df2, aes(x = activitylevel, y = calories)) +
  stat_summary(geom = "col", fun = "mean")
Finally, if you're starting with your data in wide form (i.e. a column for each activity level), then I'd suggest you look up the function tidyr::pivot_longer, which will wrangle your data into the form required for stat_summary.
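For the wide case, the reshaping step might look something like the sketch below. The `wide` data frame is made up to stand in for the asker's data, and the names `long` and `value` are illustrative; only the four activity column names come from the question.

```r
library(ggplot2)
library(tidyr)

# Made-up stand-in for the real data: one numeric column per activity level
set.seed(1)
wide <- data.frame(
  veryactive    = rnorm(20, 20, 5),
  fairlyactive  = rnorm(20, 15, 5),
  lightlyactive = rnorm(20, 200, 30),
  sedentary     = rnorm(20, 1000, 100)
)

levels <- c("veryactive", "fairlyactive", "lightlyactive", "sedentary")

# Stack the four columns: column names go into "activitylevel",
# the numbers go into "value" (one row per original cell)
long <- pivot_longer(wide,
                     cols = all_of(levels),
                     names_to = "activitylevel",
                     values_to = "value")

# Fix the bar order; otherwise the x axis is sorted alphabetically
long$activitylevel <- factor(long$activitylevel, levels = levels)

# stat_summary() computes the mean of each group and draws it as a bar
p <- ggplot(long, aes(x = activitylevel, y = value)) +
  stat_summary(geom = "col", fun = "mean")
p
```

Listing the columns in `cols` means any other columns in the real data (such as calories) are left alone and simply repeated on each row.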