I know this gets asked a lot, but I'm having trouble making a 100% stacked bar plot in R. There are plenty of pages explaining how, but nothing has worked for me, and I suspect the data I'm importing isn't structured correctly, so I'd like to know what I'm doing wrong there. My data looks like the data in the attached picture. I can create the exact chart I want in Excel, which is also in the picture (the bar graph on the right; I couldn't attach more than one picture, so both are in the same one), but for various reasons I need it in R. Is the way the data is laid out in Excel incorrect, and if so, how do I fix it?
In ggplot2 at least, you need to convert your data from "wide" to "long" format. Below, I use the tidyr::gather function to "gather" the two data columns ("running" and "jumping") into a single "fraction" column, which you can then color by "activity".
library(magrittr) # For pipe (%>%)
dat <- tibble::tibble(
  weeks = 1:15,
  running = runif(15, 0, 1),
  jumping = 1 - running
)
dat
#> # A tibble: 15 x 3
#> weeks running jumping
#> <int> <dbl> <dbl>
#> 1 1 0.675 0.325
#> 2 2 0.727 0.273
#> 3 3 0.430 0.570
#> 4 4 0.324 0.676
#> 5 5 0.809 0.191
#> 6 6 0.260 0.740
#> 7 7 0.433 0.567
#> 8 8 0.872 0.128
#> 9 9 0.0288 0.971
#> 10 10 0.903 0.0970
#> 11 11 0.295 0.705
#> 12 12 0.538 0.462
#> 13 13 0.342 0.658
#> 14 14 0.291 0.709
#> 15 15 0.877 0.123
library(ggplot2)
dat_long <- dat %>%
  tidyr::gather(activity, fraction, running, jumping)
dat_long
#> # A tibble: 30 x 3
#> weeks activity fraction
#> <int> <chr> <dbl>
#> 1 1 running 0.675
#> 2 2 running 0.727
#> 3 3 running 0.430
#> 4 4 running 0.324
#> 5 5 running 0.809
#> 6 6 running 0.260
#> 7 7 running 0.433
#> 8 8 running 0.872
#> 9 9 running 0.0288
#> 10 10 running 0.903
#> # ... with 20 more rows
ggplot(dat_long) +
  aes(x = factor(weeks), y = fraction, fill = activity) +
  geom_col()
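The plot above reaches 100% only because "running" and "jumping" already sum to 1 in every week. If your columns hold raw counts instead, a minimal sketch like the following may help. It assumes a reasonably recent tidyr (1.0.0 or later) for pivot_longer(), which is the newer interface for the same reshape as gather(), and it uses made-up counts in a hypothetical data frame called "counts". position = "fill" asks ggplot2 to rescale each bar to a total height of 1.

```r
library(ggplot2)

# Hypothetical raw counts (not fractions); the values are made up for illustration.
counts <- data.frame(
  weeks   = 1:4,
  running = c(3, 5, 2, 8),
  jumping = c(1, 5, 6, 2)
)

# Reshape from wide to long: one row per (week, activity) pair.
counts_long <- tidyr::pivot_longer(
  counts,
  cols      = c(running, jumping),
  names_to  = "activity",
  values_to = "count"
)

# position = "fill" rescales every bar so its segments sum to 1,
# and scales::percent labels the y axis as percentages.
p <- ggplot(counts_long) +
  aes(x = factor(weeks), y = count, fill = activity) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent)
p
```

With data that is already expressed as fractions, as in the answer above, position = "fill" should leave the plot unchanged, so it is a harmless default if you are unsure.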
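You can also do this in base R by converting to a "wide" matrix with one row per activity and one column per week, because barplot stacks the rows of each column into a single bar. (Note that I also use [, -1] to drop the first column, "weeks", before transposing with t. The values printed below differ from the table above because they come from a separate run of runif.)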
dat_tmat <- t(as.matrix(dat[, -1]))
dat_tmat
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> running 0.5227949 0.5352537 0.5879579 0.2678927 0.93068128 0.2948861
#> jumping 0.4772051 0.4647463 0.4120421 0.7321073 0.06931872 0.7051139
#> [,7] [,8] [,9] [,10] [,11] [,12]
#> running 0.07729363 0.8925416 0.5503279 0.007479232 0.02991765 0.5832765
#> jumping 0.92270637 0.1074584 0.4496721 0.992520768 0.97008235 0.4167235
#> [,13] [,14] [,15]
#> running 0.8660134 0.1156794 0.3176998
#> jumping 0.1339866 0.8843206 0.6823002
barplot(dat_tmat, col = c("blue", "red"))
legend("topleft", c("running", "jumping"), col = c("blue", "red"), lwd = 5, bg = "white")
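As with the ggplot2 version, the base R plot reaches 100% only because each column of the matrix already sums to 1. For raw counts, one possible approach is to normalize each column with prop.table() first. The sketch below uses made-up numbers in a hypothetical matrix called "counts" and lets barplot() draw the legend itself through legend.text.

```r
# Hypothetical raw counts; rows are activities, columns are weeks.
counts <- rbind(
  running = c(3, 5, 2, 8),
  jumping = c(1, 5, 6, 2)
)
colnames(counts) <- 1:4

# margin = 2 divides every column by its own total, so each bar sums to 1.
fractions <- prop.table(counts, margin = 2)

# legend.text and args.legend draw the legend with filled boxes
# that match the bar colors.
barplot(
  fractions,
  col = c("blue", "red"),
  xlab = "weeks",
  ylab = "fraction",
  legend.text = rownames(fractions),
  args.legend = list(x = "topleft", bg = "white")
)
```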